THE ANNALS OF MATHEMATICAL STATISTICS


EDITORIAL STAFF 


EDITOR
WILLIAM KRUSKAL


ASSOCIATE EDITORS


ALLAN BIRNBAUM DONALD A. DARLING N. L. JOHNSON
Z. W. BIRNBAUM WASSILY HOEFFDING OSCAR KEMPTHORNE
DOUGLAS G. CHAPMAN J. L. HODGES, JR. E. L. LEHMANN


WITH THE COOPERATION OF


J. F. Daly Samuel Karlin G. E. Noether
Cyrus Derman Harry Kesten Howard Raiffa
J. L. Doob C. H. Kraft H. E. Robbins
Meyer Dwass Solomon Kullback Walter L. Smith
D. A. S. Fraser Eugene Lukacs Lionel Weiss


PAST EDITORS OF THE ANNALS


H. C. Carver, 1930-1938 T. W. Anderson, 1950-1952
S. S. Wilks, 1938-1949 E. L. Lehmann, 1953-1955
T. E. Harris, 1955-1958


Published quarterly by the Institute of Mathematical Statistics in March, 
June, September and December. 


IMS INSTITUTIONAL MEMBERS 


Aberdeen Proving Grounds (Ballistic Research Laboratories), Aberdeen, Maryland
American Viscose Corporation, Marcus Hook, Pennsylvania
Bell Telephone Laboratories, Inc., Technical Library, 463 West Street, New York 14, New York
Boeing Airplane Company, Box 3707, Seattle, Washington
California Research Corporation, P. O. Box 1627, Richmond, California
Cornell University, Mathematics Department, Ithaca, New York
General Analysis Corporation, 11753 Wilshire Blvd., Los Angeles 25, California
Indiana University, The Library, Bloomington, Indiana
International Business Machines Corporation, Applied Science Library, White Plains, New York
Iowa State University, Statistical Laboratory, Ames, Iowa
Lockheed Aircraft Corporation, Engineering Library, Burbank, California
Michigan State University, Department of Statistics, East Lansing, Michigan
National Security Agency, Fort George G. Meade, Maryland
Princeton University, Department of Mathematics, Section of Mathematical Statistics, Princeton, New Jersey
Purdue University Libraries, Lafayette, Indiana
Radio Corporation of America, R.C.A. Laboratories, Princeton, New Jersey
Sandia Corporation, Sandia Base, Albuquerque, New Mexico
Space Technology Laboratories, P. O. Box 95001, Los Angeles 45, California
Stanford University, Girshick Memorial Library, Stanford, California
State University of Iowa, Iowa City, Iowa
The Catholic University of America, Eng. and Math. Library, Washington, D. C.
The Ramo-Wooldridge Corporation, Los Angeles, California
Union Carbide Corporation, 30 East 42nd Street, New York 17, New York
United States Steel Corporation Library, Monroeville, Penna.
University of California, Statistical Laboratory, Berkeley, California
University of Illinois, Serials Department, Urbana, Illinois
University of North Carolina, Department of Statistics, Chapel Hill, North Carolina
University of Puerto Rico, School of Tropical Medicine, San Juan, Puerto Rico
University of Washington, Laboratory of Statistical Research, Seattle, Washington








THE ALGEBRA OF A LINEAR HYPOTHESIS¹


By Henry B. Mann 
The Ohio State University 


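Introduction. Let $y = (y_1, \ldots, y_N)$ be a random vector. We consider the
following sequence of hypotheses:


A (assumption): $E(y_\alpha) = \sum_{j=1}^{s} p_{\alpha j}\beta_j$, $\alpha = 1, \ldots, N$.
$H_1$: $\beta_1 = \cdots = \beta_{s_1} = 0$,
$\cdots$
$H_r$: $\beta_{s_1 + \cdots + s_{r-1} + 1} = \cdots = \beta_{s_1 + \cdots + s_r} = 0$,


where $s_1 + s_2 + \cdots + s_r = s$.
For $\alpha = 1, \ldots, N$, $l = 1, \ldots, r$ we put

(1) $p^{(l)}_{\alpha j} = p_{\alpha j}$ for $j = s_1 + \cdots + s_{l-1} + 1, \ldots, s_1 + \cdots + s_l$, $p^{(l)}_{\alpha j} = 0$ otherwise, $p_l = (p^{(l)}_{\alpha j})$.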


We consider the algebra $\mathfrak{A}$ generated over a real field by the matrices $p_ip_i'$, where
$A'$ denotes the transpose of $A$. It will be seen that this algebra is closely related
to the analysis of variance of our linear hypotheses. In particular all tests of
sequences of hypotheses correspond to a decomposition of $\mathfrak{A}$ into left ideals.
Thus the study of the decomposition of $\mathfrak{A}$ sheds considerable light on the analysis
of variance appropriate to the linear hypothesis. The algebra $\mathfrak{A}$ was first
considered by A. T. James [1] for the important case that the matrices $p_ip_i'$ are
relationship matrices. James also pointed out that $\mathfrak{A}$ is semisimple and hence a
direct sum of complete matrix algebras.

In this paper we shall first consider the general problem and show that the
tests appropriate to the sequence of hypotheses $H_1^* = H_1$, $H_2^* = H_1 \& H_2$, $\ldots$,
$H_r^* = H_1 \& H_2 \cdots \& H_r$ lead to a decomposition of $\mathfrak{A}$ into (not necessarily
simple) left ideals. We shall then consider the case where $\mathfrak{A}$ is generated by two
generators $p_1p_1'$, $p_2p_2'$ and where moreover $(p_ip_i')^2 = \mu_i(p_ip_i')$. (Throughout this
paper Greek letters will denote scalars.) In this case we shall obtain the complete
decomposition of $\mathfrak{A}$ into principal components. This case includes in particular
all those incomplete block designs in which each block contains the same number
of experimental units while each treatment is replicated the same number of
times. We shall then be able to establish a relation between the decomposition
of $\mathfrak{A}$ into principal components and the power function of our tests. Finally we
shall illustrate our methods by decomposing the algebra of an $s$-dimensional cubic
lattice into its principal components.
In the following the term matrix will always mean a matrix with real elements.

Received September 23, 1958.

¹ Also sponsored by the United States Army under Contract No. DA-11-022-ORD-2050,
Mathematics Research Center, United States Army, University of Wisconsin.


1. General Theorems. 

THEOREM 1. Let $p$ be any matrix. There exists a matrix $A$ such that $p'pA = p'$,
where $p'$ denotes the transpose of $p$. The matrix $pA$ is moreover idempotent and
symmetric.

PROOF: Let $p$ have $N$ rows and $s$ columns. Consider the indeterminate $N$
dimensional column vector $y$. Put $E(y) = pb$ where $b$ is an $s$ dimensional column
vector. Then for any choice of $y$ the expression $Q = \sum_\alpha (y_\alpha - E(y_\alpha))^2$ must
have a minimum with respect to $b$. Differentiating with respect to each $b_j$ we
obtain the equation


(2) $p'y = p'pb$


which must have a solution since $Q$ has a minimum. Since (2) is a system of
linear equations the $b$'s must be linear functions of the $y$'s. Hence $b = Ay$ and
therefore


(3) $p' = p'pA$.


Multiplying (3) from the left by $A'$ we get $A'p' = A'p'pA$.
Hence $A'p'$ and therefore also $pA$ is symmetric. Furthermore,


$(pA)^2 = A'p'pA = A'p' = pA$.


COROLLARY: If $pp'$ is an idempotent matrix then $p'pp' = p'$, $pp'p = p$.
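
Theorem 1 can be checked by hand on a very small case. The sketch below is an illustration only: the design matrix `p` and the solution `A` are assumptions chosen so that $p'p$ is nonsingular, and the helper functions are not part of the paper. It verifies equation (3) and the symmetry and idempotency of $pA$ in exact arithmetic.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


# Hypothetical design matrix p: N = 3 observations, s = 2 parameters.
p = [[1, 0], [1, 0], [0, 1]]
pt = transpose(p)

# Here p'p = diag(2, 1) is nonsingular, so A = (p'p)^{-1} p' is one solution.
A = [[Fraction(1, 2), Fraction(1, 2), 0], [0, 0, 1]]
pA = matmul(p, A)

solves_normal_equations = matmul(matmul(pt, p), A) == pt  # equation (3)
is_symmetric = pA == transpose(pA)
is_idempotent = matmul(pA, pA) == pA
print(solves_normal_equations, is_symmetric, is_idempotent)
```

When $p'p$ is singular the matrix $A$ is no longer unique, but by Theorem 3 the product $pA$ still is.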

THEOREM 2. If $a_0 + a_1x + \cdots + a_nx^n$ is the minimal polynomial of a symmetric
matrix then either $a_0 \neq 0$ or $a_1 \neq 0$.

Theorem 2 is an immediate consequence of the fact that a symmetric matrix
may be transformed into a diagonal matrix by an orthogonal transformation.

THEOREM 3. The matrix $pA$ of Theorem 1 is uniquely determined.

If $a_0 + a_1x + \cdots + a_nx^n$ is the minimal polynomial for $p'p$, then


(4) $pA = -(a_1pp' + \cdots + a_n(pp')^n)$ if $a_0 = 1$,
    $pA = -(a_2pp' + \cdots + a_n(pp')^{n-1})$ if $a_0 = 0$, $a_1 = 1$.

Let $a_0 = 1$, then $I = -(a_1p'p + \cdots + a_n(p'p)^n)$ where $I$ is the unit matrix.
Multiplying this equation from the right by $A$ and from the left by $p$ we obtain
the first equation of (4).

Let $a_0 = 0$, $a_1 = 1$, then $p'p = -(a_2(p'p)^2 + \cdots + a_n(p'p)^n)$ and we obtain
the second equation of (4) by multiplying left by $A'$ and right by $A$.

COROLLARY: The matrix $pA$ of Theorem 1 lies in the algebra generated by $pp'$.
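
As a small worked case of (4) (an illustration added here, under the assumption that $p'p$ is singular, nonzero, and satisfies $(p'p)^2 = k\,p'p$ for a scalar $k \neq 0$), the minimal polynomial of $p'p$ normalized so that $a_1 = 1$ is $x - x^2/k$, and the second line of (4) with $n = 2$ gives:

```latex
\begin{align*}
  a_0 &= 0, \qquad a_1 = 1, \qquad a_2 = -\frac{1}{k}, \\
  pA  &= -\,a_2\,pp' = \frac{1}{k}\,pp' .
\end{align*}
```

So in this case $pA$ is simply $pp'$ rescaled to be idempotent, in agreement with the corollary.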

We now consider the sequence of hypotheses $H_1$, $H_1 \& H_2$, $\ldots$, $H_1 \& H_2
\& \cdots \& H_r$. Put


(5) $P_1 = p_1 + p_2 + \cdots + p_r$, $P_2 = p_2 + \cdots + p_r$, $\ldots$, $P_r = p_r$.

We solve

(6) $P_i' = P_i'P_iA_i$.


The vectors $Y^{(i)} = P_iA_iy$ are the regression values corresponding to the
hypotheses $H_1 \& H_2 \& \cdots H_{i-1}$, $(i = 1, \ldots, r)$. The decomposition


(7) $\sum_\alpha y_\alpha^2 = y'y = y'(I - P_1A_1)y + y'(P_1A_1 - P_2A_2)y + \cdots + y'(P_{r-1}A_{r-1} - P_rA_r)y + y'P_rA_ry$


where $I$ denotes the identity is the proper decomposition of the sum of squares
for testing the hypotheses $H_1 \& \cdots \& H_r$ [2, p. 33].

If the parameters $\beta_j$ of our linear hypothesis are subject to restrictions and if
there exists a solution $b = A_iy$ satisfying the restrictions then, since $P_iA_i$ is
unique by Theorem 3, the decomposition (7) will still be the appropriate
decomposition for the analysis of variance, although the degrees of freedom will
have to be adjusted. Thus all our results will remain applicable to this case.
If the least square equations are solved by the method of Lagrange operators,
the existence of solutions of the least square equations which satisfy the
restrictions means that the Lagrange operators may be ignored. A very important case
of this type is treated in Theorem 4.4 of [2].

Corresponding to (7) we have, as we shall show, a decomposition of $\mathfrak{A} \cup I$
into left ideals.

We have


$I = (I - P_1A_1) + (P_1A_1 - P_2A_2) + \cdots + (P_{r-1}A_{r-1} - P_rA_r) + P_rA_r$.


We have $p_iP_j' = p_ip_i'$ for $i \geq j$, hence from (6) we get


(8) $p_ip_i' = p_ip_i'P_jA_j = P_jA_jp_ip_i'$ for $i \geq j$.


Now by Theorem 3 $P_iA_i$ is a polynomial in $P_iP_i' = \sum_{l=i}^{r} p_lp_l'$. Hence for
$i \geq j$ we have $P_iA_iP_jA_j = P_jA_jP_iA_i = P_iA_i$ and thus the idempotents $e_i =
P_iA_i - P_{i+1}A_{i+1}$ $(i = 0, \ldots, r;\ P_0A_0 = I,\ P_{r+1}A_{r+1} = 0)$ are a set of orthogonal
idempotents. Hence [3, p. 147, Problem 4] the decomposition $\mathfrak{A} = \mathfrak{A}e_1 + \cdots +
\mathfrak{A}e_r$ is a representation of $\mathfrak{A}$ as a direct sum of left ideals. These left ideals are
however not always simple left ideals.
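
The orthogonality of the idempotents and the resulting split of the sum of squares can be seen on a toy case. In the sketch below the matrices `P1A1` and `P2A2` are assumed examples (averaging within two groups of two plots, and averaging over all four plots); they are written down directly rather than obtained by solving (6), and the data vector `y` is made up.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def quad(y, m):
    """Quadratic form y'My."""
    return sum(y[i] * m[i][j] * y[j] for i in range(len(y)) for j in range(len(y)))


h, q = Fraction(1, 2), Fraction(1, 4)
identity = [[Fraction(int(i == j)) for j in range(4)] for i in range(4)]
# Assumed example with r = 2: P_1 A_1 averages within two groups of two plots,
# P_2 A_2 averages over all four plots (general mean only).
P1A1 = [[h, h, 0, 0], [h, h, 0, 0], [0, 0, h, h], [0, 0, h, h]]
P2A2 = [[q] * 4 for _ in range(4)]

e = [sub(identity, P1A1), sub(P1A1, P2A2), P2A2]  # e_0, e_1, e_2
zero = [[0] * 4 for _ in range(4)]
orthogonal = all(matmul(e[i], e[j]) == zero
                 for i in range(3) for j in range(3) if i != j)
idempotent = all(matmul(x, x) == x for x in e)

y = [3, 5, 2, 8]                    # made-up observations
parts = [quad(y, x) for x in e]     # within groups, between groups, mean
total = sum(v * v for v in y)
print(orthogonal, idempotent, parts, total)
```

The three quadratic forms add up to $y'y$, which is the decomposition of the sum of squares in this small case.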

THEOREM 4. The algebra $\mathfrak{A}$ generated by $p_ip_i'$, $i = 1, \ldots, r$ has the unit element
$P_1A_1$. The matrix $P_1P_1'$ has an inverse in $\mathfrak{A}$.

Equation (8) shows that $P_1A_1$ is the unit element of $\mathfrak{A}$. Equation (4) may
be written


$P_1A_1 = P_1P_1'(-a_1P_1A_1 - a_2P_1P_1' - \cdots - a_n(P_1P_1')^{n-1})$


if $P_1'P_1$ is nonsingular and

$P_1A_1 = P_1P_1'(-a_2P_1A_1 - \cdots - a_n(P_1P_1')^{n-2})$

if $P_1'P_1$ is singular.
Let now $\mathfrak{A}$ be generated by one matrix $pp'$. We assume first that $pp'$ is a
diagonal matrix. Let $\lambda_1, \ldots, \lambda_k$ be its distinct characteristic roots. Then


(9) $E_1 = \dfrac{(pp' - \lambda_2I) \cdots (pp' - \lambda_kI)}{(\lambda_1 - \lambda_2) \cdots (\lambda_1 - \lambda_k)}$, $\ldots$,
    $E_i = \dfrac{(pp' - \lambda_1I) \cdots (pp' - \lambda_{i-1}I)(pp' - \lambda_{i+1}I) \cdots (pp' - \lambda_kI)}{(\lambda_i - \lambda_1) \cdots (\lambda_i - \lambda_{i-1})(\lambda_i - \lambda_{i+1}) \cdots (\lambda_i - \lambda_k)}$

are orthogonal idempotents and the decomposition

(10) $\mathfrak{A} = \mathfrak{A}E_1 + \cdots + \mathfrak{A}E_k = \{\mu_1E_1\} + \cdots + \{\mu_kE_k\}$


is a decomposition of $\mathfrak{A}$ into principal components.

If $pp'$ is not diagonal let $T$ be an orthogonal matrix such that $Tpp'T'$ is a
diagonal matrix. The mapping $pp' \to Tpp'T'$ is a faithful isomorphism.
Hence the decomposition (10) with the $E_i$ given by (9) is still a decomposition
of $\mathfrak{A}$ into principal components.
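
The idempotents of (9) are the Lagrange interpolation polynomials evaluated at the generating matrix. The following sketch computes them for a small symmetric matrix; the matrix `m` is only a stand-in for $pp'$, chosen because its characteristic roots 1 and 3 are easy to see, and the function name is hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def spectral_idempotents(m, roots):
    """E_i = prod_{j != i} (m - lambda_j I) / (lambda_i - lambda_j), as in (9)."""
    n = len(m)
    out = []
    for li in roots:
        e = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
        for lj in roots:
            if lj != li:
                factor = [[Fraction(m[r][c] - (lj if r == c else 0), li - lj)
                           for c in range(n)] for r in range(n)]
                e = matmul(e, factor)
        out.append(e)
    return out


m = [[2, 1], [1, 2]]  # symmetric stand-in for pp'; characteristic roots 1 and 3
E1, E2 = spectral_idempotents(m, [1, 3])

zero = [[0, 0], [0, 0]]
products_vanish = matmul(E1, E2) == zero and matmul(E2, E1) == zero
both_idempotent = matmul(E1, E1) == E1 and matmul(E2, E2) == E2
rebuilt = [[1 * E1[r][c] + 3 * E2[r][c] for c in range(2)] for r in range(2)]
print(products_vanish, both_idempotent, rebuilt == m)
```

The last check confirms that the matrix is recovered as $\lambda_1E_1 + \lambda_2E_2$, so each $\mathfrak{A}E_i$ is one dimensional.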

In considering the general problem we may therefore assume that the matrices
$p_ip_i'$ are idempotent. We shall also assume $r > 1$. Let $\mathfrak{A}$ be generated by the
idempotent matrices $p_1p_1'$, $p_2p_2'$, $\ldots$, $p_rp_r'$.

THEOREM 5. If for $G \neq 0$, $G \in \mathfrak{A}$, $G_1 \in \mathfrak{A}$ we have $p_ip_i'G = \mu G$, $p_ip_i'G_1 = \mu_1G_1$
for $i = 1, \ldots, r$ then $\mu = 1$, $G_1 = \alpha G$.

We have $p_ip_i'G = (p_ip_i')^2G = \mu G = \mu^2G$. With $P_1A_1$ defined by (6) we have
by Theorem 4 $P_1A_1G = G$, hence $\mu \neq 0$ and $\mu^2 = \mu$ implies $\mu = 1$.

If $B$ is any element of $\mathfrak{A}$ we may write


$B = \sum \alpha_iB_i$


where the $B_i$ are monomials in $p_1p_1', \ldots, p_rp_r'$ and

(11) $BG = \alpha G$; $\alpha = \sum \alpha_i$.

Since $\mathfrak{A}$ is generated by symmetric matrices, $A \in \mathfrak{A}$ implies $A' \in \mathfrak{A}$ and so

(12) $G'G = \lambda G$.


For any matrix $M \neq 0$ we have $M'M \neq 0$, hence in (12) $\lambda \neq 0$ and (12) shows
that $G$ is a symmetric matrix. Thus $G_1'G = \alpha G = \alpha^*G_1'$. If $\alpha = 0$ then in the
representation of $G_1$ by monomials we have $\sum \alpha_i = 0$ and therefore $G_1'G_1 = 0$,
whence $G_1 = 0 = 0G$. If $\alpha^* \neq 0$, $\alpha \neq 0$ we have $G_1 = (\alpha/\alpha^*)G$. This proves
Theorem 5.

If $G^2 = \lambda G$ we may replace $G$ by $G/\lambda$. Hence we may assume that $G$ is
idempotent and decompose $\mathfrak{A}$ into


(13) $\mathfrak{A} = (\mathfrak{A} - \mathfrak{A}G) + \mathfrak{A}G = (\mathfrak{A} - \mathfrak{A}G) + \{\alpha G\}$.


The one dimensional two sided ideal $\{\alpha G\}$ is a principal component of $\mathfrak{A}$ and
in $\mathfrak{A} - \mathfrak{A}G$ the element 0 is the only element $G_1$ for which $p_ip_i'G_1 = G_1$.


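2. Algebras generated by two idempotent generators. If $p_1p_1'$ has an inverse in
$\mathfrak{A}$ then $p_1p_1' = P_1A_1$ and the algebra becomes trivial. Hence we may assume that
$p_1p_1'$ and therefore also $p_1p_1'p_2p_2'p_1p_1'$ are singular.

THEOREM 6. Let the algebra $\mathfrak{A}$ be generated by two idempotent generators $p_1p_1'$,
$p_2p_2'$. Let $T_1 = p_1p_1'p_2p_2'p_1p_1'$, $T_2 = p_2p_2'p_1p_1'p_2p_2'$ and let $M(x) = x(x - \lambda_1)
\cdots (x - \lambda_n)$ be the minimal polynomial of $T_1$. Put


(14) $F_1 = \dfrac{(T_1 - \lambda_1) \cdots (T_1 - \lambda_n)}{(-1)^n\lambda_1 \cdots \lambda_n}\,p_1p_1'$,

     $F_2 = \dfrac{(T_2 - \lambda_1) \cdots (T_2 - \lambda_n)}{(-1)^n\lambda_1 \cdots \lambda_n}\,p_2p_2'$,

     $e_1^{(\alpha)} = \dfrac{T_1(T_1 - \lambda_1) \cdots (T_1 - \lambda_{\alpha-1})(T_1 - \lambda_{\alpha+1}) \cdots (T_1 - \lambda_n)}{\lambda_\alpha(\lambda_\alpha - \lambda_1) \cdots (\lambda_\alpha - \lambda_{\alpha-1})(\lambda_\alpha - \lambda_{\alpha+1}) \cdots (\lambda_\alpha - \lambda_n)}$,

     $e_2^{(\alpha)} = \dfrac{T_2(T_2 - \lambda_1) \cdots (T_2 - \lambda_{\alpha-1})(T_2 - \lambda_{\alpha+1}) \cdots (T_2 - \lambda_n)}{\lambda_\alpha(\lambda_\alpha - \lambda_1) \cdots (\lambda_\alpha - \lambda_{\alpha-1})(\lambda_\alpha - \lambda_{\alpha+1}) \cdots (\lambda_\alpha - \lambda_n)}$,

     $f_\alpha = G$ if $\lambda_\alpha = 1$,

     $f_\alpha = \dfrac{(e_1^{(\alpha)} - e_2^{(\alpha)})^2}{1 - \lambda_\alpha}$ if $\lambda_\alpha \neq 1$.


Then

(i) $\mathfrak{A} = \mathfrak{A}F_1 + \mathfrak{A}F_2 + \sum_{\alpha=1}^{n} \mathfrak{A}f_\alpha$ is the decomposition of $\mathfrak{A}$ into principal
components. (One or both of the components $\mathfrak{A}F_1$, $\mathfrak{A}F_2$ may reduce to 0.)

(ii) $p_1p_1'F_1 = F_1$, $p_1p_1'F_2 = 0$, $p_2p_2'F_1 = 0$, $p_2p_2'F_2 = F_2$, $p_1p_1'G = p_2p_2'G = G$.

The algebras $\mathfrak{A}F_1$, $\mathfrak{A}F_2$, $\mathfrak{A}G$ are complete $1 \times 1$ matrix algebras or zero.

(iii) For $\lambda_\alpha \neq 1$ the algebras $\mathfrak{B}_\alpha = \mathfrak{A}f_\alpha$ are complete $2 \times 2$ matrix algebras.

PROOF: It is clear from (9) that $F_1$, $F_2$, $e_1^{(\alpha)}$, $e_2^{(\alpha)}$ are idempotents.

Furthermore $F_1p_1p_1' = p_1p_1'F_1 = F_1$, $F_1p_1p_1'p_2p_2'p_1p_1' = 0$. By Theorem 1 we
have from this $F_1p_1p_1'p_2 = 0$, and so


(15) $F_1p_1p_1'p_2p_2' = F_1p_2p_2' = 0$,


and by transposing $p_2p_2'F_1 = 0$. Hence $F_1$, and similarly $F_2$, are in the center of $\mathfrak{A}$.
That $F_1$, $F_2$ and the $f_\alpha$ are orthogonal follows from the following Lemma.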





LEMMA 1. For any polynomial $H(x) = x\varphi(x)$ we have


$p_1p_1'H(p_2p_2'p_1p_1'p_2p_2') = H(p_1p_1'p_2p_2'p_1p_1')p_2p_2'$.


This follows easily since the relation


$p_1p_1'(p_2p_2'p_1p_1'p_2p_2')^m = (p_1p_1'p_2p_2'p_1p_1')^mp_2p_2'$


holds for every $m > 0$.
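If $\lambda_\alpha = 1$ put $e_1^{(\alpha)} = G_1$ and $e_2^{(\alpha)} = G_2$.
We have $(T_1 - 1)G_1 = (T_2 - 1)G_2 = 0$.
Hence


(16) $T_1G_1 = G_1$, $T_2G_2 = G_2$.


Putting $c = G_1 - G_1p_2p_2'$ we find from (16) $cc' = 0$, hence $c = 0$ and $G_1 = G_1p_2p_2'
= G_1p_1p_1'$ and similarly $G_2 = G_2p_1p_1' = G_2p_2p_2'$. Hence by Theorem 5 $G_1 = G_2 = G$,
where $G$ satisfies the relations of Theorem 5.
Now let $\lambda_\alpha \neq 1$. We have

(17) $p_1p_1'e_1^{(\alpha)} = e_1^{(\alpha)}$, $p_2p_2'e_1^{(\alpha)} = e_2^{(\alpha)}p_1p_1'$,
     $p_2p_2'e_2^{(\alpha)} = e_2^{(\alpha)}$, $p_1p_1'e_2^{(\alpha)} = e_1^{(\alpha)}p_2p_2'$,
     $T_1e_1^{(\alpha)} = \lambda_\alpha e_1^{(\alpha)}$, $T_2e_2^{(\alpha)} = \lambda_\alpha e_2^{(\alpha)}$.

Thus

$e_1^{(\alpha)}e_2^{(\alpha)}e_1^{(\alpha)} = \lambda_\alpha e_1^{(\alpha)}$

and

$f_\alpha p_1p_1' = p_1p_1'f_\alpha = e_1^{(\alpha)}$, $f_\alpha p_2p_2' = p_2p_2'f_\alpha = e_2^{(\alpha)}$.

This shows that $f_\alpha$ is in the center of $\mathfrak{A}$. Moreover $T_1f_\alpha = \lambda_\alpha e_1^{(\alpha)}$, $T_2f_\alpha = \lambda_\alpha e_2^{(\alpha)}$.
Using these relations one easily finds $f_\alpha^2 = f_\alpha$. We show next that the direct
sum $\mathfrak{A}F_1 + \mathfrak{A}F_2 + \sum \mathfrak{A}f_\alpha$ contains the algebra $\mathfrak{A}$. From Lagrange's interpolation
formula we have, setting $\lambda_0 = 0$,


$\sum_{\alpha=0}^{n} \dfrac{M(x)}{(x - \lambda_\alpha)M'(\lambda_\alpha)} = 1$.

Substituting in this identity $T_1$ for $x$ and $P_1A_1$ for the unit element and
multiplying by $p_1p_1'$ we get

$\sum_\alpha f_\alpha p_1p_1' + F_1 = p_1p_1'$

and similarly

$\sum_\alpha f_\alpha p_2p_2' + F_2 = p_2p_2'$.


Hence $\mathfrak{A}F_1 + \mathfrak{A}F_2 + \sum_\alpha \mathfrak{A}f_\alpha$ contains both generators of $\mathfrak{A}$ and therefore $\mathfrak{A}$
itself.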





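Every element of the algebra $\mathfrak{A}f_\alpha = \mathfrak{B}_\alpha$ may be written in the form
$a_{11}e_1^{(\alpha)} + a_{12}e_2^{(\alpha)} + a_{21}e_1^{(\alpha)}e_2^{(\alpha)} + a_{22}e_2^{(\alpha)}e_1^{(\alpha)}$. The elements


$f_{11} = e_1^{(\alpha)}$, $f_{22} = \dfrac{e_2^{(\alpha)} - e_1^{(\alpha)}e_2^{(\alpha)} - e_2^{(\alpha)}e_1^{(\alpha)} + \lambda_\alpha e_1^{(\alpha)}}{1 - \lambda_\alpha}$,

$f_{12} = \dfrac{e_1^{(\alpha)}e_2^{(\alpha)} - \lambda_\alpha e_1^{(\alpha)}}{\lambda_\alpha}$, $f_{21} = \dfrac{e_2^{(\alpha)}e_1^{(\alpha)} - \lambda_\alpha e_1^{(\alpha)}}{1 - \lambda_\alpha}$

satisfy the condition $f_{ij}f_{jk} = f_{ik}$, $f_{ij}f_{kl} = 0$ for $j \neq k$.
Hence we have the isomorphism from $\mathfrak{B}_\alpha$ onto a complete two dimensional
matrix algebra

$f_{11} \to \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $f_{22} \to \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$,

$f_{12} \to \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $f_{21} \to \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$.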


This completes the proof of Theorem 6. 
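
The smallest instance of a $2 \times 2$ component is generated by two rank one symmetric idempotents in the plane. The sketch below is an assumed toy example, not taken from the paper: `e1` and `e2` play the role of the two generators, the confounding root is $\lambda = 9/25$, and the code checks that $T_1 = \lambda e_1$ and that $(e_1 - e_2)^2/(1 - \lambda)$ is the unit of the component.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


# Two rank-one symmetric idempotents in the plane (hypothetical generators).
e1 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(0)]]
e2 = [[Fraction(9, 25), Fraction(12, 25)], [Fraction(12, 25), Fraction(16, 25)]]
lam = Fraction(9, 25)  # root of T1 = e1 e2 e1 in this example

T1 = matmul(matmul(e1, e2), e1)
d = sub(e1, e2)
f = scale(1 / (1 - lam), matmul(d, d))  # f = (e1 - e2)^2 / (1 - lambda)

identity = [[1, 0], [0, 1]]
print(T1 == scale(lam, e1), f == identity)
```

Since `f` is the identity here, the algebra generated by `e1` and `e2` is the full algebra of $2 \times 2$ matrices, as part (iii) of Theorem 6 predicts for $\lambda_\alpha \neq 1$.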

COROLLARY: If the scalar field of $\mathfrak{A}$ is real then the principal components of $\mathfrak{A}$
are real.

This follows since $T_1$ is a symmetric (even positive semi definite) matrix.
Hence its characteristic roots are real (even non-negative).

In applying Theorem 6 to concrete situations it is often of advantage to
replace the matrices $p_ip_i'$ by $p_ip_i' - G$. In the algebra $\mathfrak{A} - \mathfrak{A}G$ obtained in this way
one then has $\lambda_\alpha \neq 1$ for all $\alpha$.


3. Relations to tests of hypotheses. If $f$ is a principal idempotent of the
algebra $\mathfrak{A}$ generated by $p_1p_1', \ldots, p_rp_r'$ then $f'$ is also an idempotent of the
center of $\mathfrak{A}$. Since $f'f \neq 0$ we must have $f'f = f$, hence $f$ is symmetric. Since
every idempotent of the center is a sum of principal idempotents it follows that
all idempotents of the center are symmetric.

The significance of the decomposition into principal components is pointed 
up by the following theorem. 

THEOREM 7. Let $P_1A_1 = I_1 + I_2$ where $I_1$, $I_2$ are orthogonal symmetric
idempotents of $\mathfrak{A}$. The idempotents $I_1$, $I_2$ belong to the center of $\mathfrak{A}$ if and only if for every
matrix $P$ such that $PP' \in \mathfrak{A}$, the relations


(18) $P'PA = P'$, $P'I_1PB_1 = P'I_1$, $P'I_2PB_2 = P'I_2$

imply

(19) $PA = I_1PB_1 + I_2PB_2$.


PROOF: By Theorem 3 we have $PA \in \mathfrak{A}$. If $I_1$ is in the center of $\mathfrak{A}$ we get
from (18)


(20) $P'I_1 = P'PAI_1 = P'I_1PA$.


Because of the uniqueness of the regression values (Theorem 3) this implies

(21) $I_1PA = I_1PB_1$

and similarly $I_2PA = I_2PB_2$ if $I_2$ is in the center of $\mathfrak{A}$. Hence

$I_1PA + I_2PA = P_1A_1PA = PA = I_1PB_1 + I_2PB_2$


if $I_1$, $I_2$ are in the center of $\mathfrak{A}$.

To prove the sufficiency of (19) put $P = p_i$. Multiplying (19) from the left
by $I_1$ gives $I_1p_iA = I_1p_iB_1$. The matrices $p_iA$ and $I_1p_iB_1$ are symmetric by
(18) (see Theorem 1). Hence $I_1p_iA = p_iAI_1$. Moreover $p_iA = p_ip_i'$ (see the
corollary to Theorem 1), and therefore $I_1p_ip_i' = p_ip_i'I_1$. Since this relation holds
for every value of $i$, the matrix $I_1$ and similarly $I_2$ are in the center of $\mathfrak{A}$.

Theorem 7 shows that it is sufficient to study the tests of linear hypotheses 
for each of the principal components separately. 

We return now to the case of two idempotent generators. 

Every vector $a = (a_1, \ldots, a_N)$ can be decomposed uniquely into two parts


(22) $a = a(I - P_1A_1) + aP_1A_1$.


THEOREM 8. If $aQ = a$ for some element $Q$ of the algebra $\mathfrak{A}$ generated by $p_1p_1'$,
$p_2p_2'$ then $a = aP_1A_1$ and hence $a(I - P_1A_1) = 0$.

For we have $a(I - P_1A_1) = aQ(I - P_1A_1) = 0$.

DEFINITION: A form $ax$, $a = a_1, \ldots, a_N$, $x = x_1, \ldots, x_N$ is called totally
confounded or confounded with coefficient 1 in $\mathfrak{A}$ if $a \neq (0, \ldots, 0)$ and


(23) $ap_1p_1' = ap_2p_2' = a$,

it is called confounded with coefficient $\lambda \neq 1$ if

(24) $a(p_1p_1' - p_2p_2')^2 = (1 - \lambda)a$.

If $\lambda = 0$ then $a$ is called unconfounded. The rows of $G$ are all totally
confounded. The rows of $F_1$ and $F_2$ are unconfounded. The rows of $f_\alpha$ are confounded
with coefficient $\lambda_\alpha$.

Multiplying (24) by $p_1p_1'$ from the right we get

(25) $ap_1p_1'p_2p_2'p_1p_1' = \lambda ap_1p_1'$

and similarly

(25a) $ap_2p_2'p_1p_1'p_2p_2' = \lambda ap_2p_2'$.


THEOREM 9. Let $a$ be any vector then


(26) $a = a_0 + a_{10} + a_{20} + \sum_\alpha a_\alpha$


where
(i) $a_0p_1p_1' = a_0p_2p_2' = 0$,
(ii) $a_{10}p_1p_1' = a_{10}$, $a_{10}p_2p_2' = 0$,
     $a_{20}p_2p_2' = a_{20}$, $a_{20}p_1p_1' = 0$,
(iii) $a_\alpha$ is confounded with coefficient $\lambda_\alpha$, $\alpha = 1, \ldots, n$,
where the $\lambda_\alpha$ are the distinct nonzero characteristic roots of $T_1$ and $\lambda_1 = 1$ if 1 is
a c.r. The decomposition (26) is unique.

PROOF: We have the decomposition


(27) $I = (I - P_1A_1) + F_1 + F_2 + \sum_\alpha f_\alpha$.


Multiplying (27) by $a$ we obtain (26).

That the decomposition (26) is unique follows from Theorem 8 and from the
following two lemmas.

LEMMA 1: If $a$ is confounded with coefficient $\lambda$ then $\lambda = \lambda_\alpha$ for some $\alpha$ and $a = af_\alpha$,
where we set $f_0 = F_1 + F_2$.

LEMMA 2: If $ap_1p_1' = a$, $ap_2p_2' = 0$ then $a = aF_1$.

PROOF OF LEMMA 1: Since $a$ is confounded it follows from Theorem 8 that
$a(I - P_1A_1) = 0$. For $\lambda_\alpha \neq 1$ we have

$af_\alpha = a\,\dfrac{(p_1p_1' - p_2p_2')^2}{1 - \lambda_\alpha}\,f_\alpha = \dfrac{1 - \lambda}{1 - \lambda_\alpha}\,af_\alpha$.

Hence $af_\alpha = 0$ for $\lambda \neq \lambda_\alpha$. For $\lambda \neq 1$ we have


$aG = a\,\dfrac{(p_1p_1' - p_2p_2')^2}{1 - \lambda}\,G = 0$.


Hence since $a \neq 0$ we must have $\lambda = \lambda_\alpha$ for some $\alpha$ and multiplying (27) by $a$
we find $a = af_\alpha$.

PROOF OF LEMMA 2: From Lemma 1 we have $a = af_0 = a(F_1 + F_2)$.
Multiplying from the right by $p_1p_1'$ we have $ap_1p_1' = a = aF_1$.

Theorem 9 shows that the rows of $f_\alpha$ span the space of all those linear forms
which are confounded with coefficient $\lambda_\alpha$. The rows of $F_i$ $(i = 1, 2)$ form the
space of all those forms $ax$ which are unconfounded and for which $ap_ip_i'x = ax$.

We shall now consider the power of the tests of our linear hypotheses and it 
will be necessary to assume that the reader is familiar with the theory of testing 
linear hypotheses and with the power functions associated with these tests. 
For the concepts and results that will be used in the following the reader may be 
referred to [2] Chapter IV, pp. 22-30 and Chapter VI. It will be seen that the 
power of the tests is closely related to the confounding coefficients. 

Suppose that we have observed a set of linear forms $Qy$ where $Q$ is an
idempotent matrix of the center of $\mathfrak{A}$ and $y' = (y_1, \ldots, y_N)$. In testing the hypothesis
$H_1$: $\beta_1 = \cdots = \beta_{s_1} = 0$ under the assumption $E(y) = p_1\beta^{(1)} + p_2\beta^{(2)}$, where
$\beta^{(1)\prime} = (\beta_1, \ldots, \beta_{s_1}, 0, \ldots, 0)$, $\beta^{(2)\prime} = (0, \ldots, 0, \beta_{s_1+1}, \ldots, \beta_{s_1+s_2})$ and the
other assumptions of a linear hypothesis as stated on page 23 of [2]; using the
forms $Qy$ we first have to solve the equation


(28) $(p_1 + p_2)'Q = (p_1 + p_2)'Q(p_1 + p_2)B_1$.

The quadratic form

(29) $y'(Q - Q(p_1 + p_2)B_1)y = Q_a$,

divided by its rank $h_2$, forms the denominator of the statistic $F$. We then
compute the regression value of $Q$ under the assumption and the hypothesis
$H_1$: $\beta_1 = \cdots = \beta_{s_1} = 0$. That is to say we have to solve the equation


$p_2'Q = p_2'Qp_2B_2$.


Since $Q$ is in the center of $\mathfrak{A}$ this equation can be solved by putting $B_2 = p_2'$ (see
the corollary to Theorem 1). Hence


$Qp_2B_2 = Qp_2p_2' = Qp_2p_2'Q$.

We then put

(30) $y'(Q - Qp_2B_2)y = Q_r$, $y'Q_{r-a}y = Q_r - Q_a$.


The matrix $Q_{r-a}$ is orthogonal to $Qp_2p_2'Q$ (see the paragraph following equation
(8)) and so by Theorem 1


(31) $p_2'QQ_{r-a} = 0$.


Hence if instead of the forms $Qy$ we substitute in $Q_r - Q_a$ their expectations
$Q(p_1\beta^{(1)} + p_2\beta^{(2)})$, under some alternative hypothesis $H_1^*$ we obtain


(32) $\beta^{(1)\prime}p_1'Q_{r-a}p_1\beta^{(1)} = 2\sigma^2\delta$


where $\sigma^2$ is the variance of one observation and $\delta$ is the quantity denoted by $\lambda$
in formula 6.37 of [2]. If $Q_{r-a}$ has the rank $h_1$ then the power of the $F$ test is a
monotonically increasing function of $\delta/h_1$ and of $h_2$. (See formula 6.37 of [2]
and the paragraph following it. To avoid confusion with the confounding
coefficients we have written $\delta$ instead of $\lambda$.) Moreover, if $h_2$ is fairly large the increase
in power obtained by increasing $h_2$ is negligibly small. We shall therefore call
$2\delta/h_1 = \rho$ the power index with respect to $H_1^*$.

If $Q$ is not orthogonal to $p_2p_2'$ a certain amount of power is lost in eliminating
the parameters $\beta_{s_1+1}, \ldots, \beta_{s_1+s_2}$. To measure this loss we consider the power
index of the test of the same hypothesis $H_1$ but under the assumption $\beta_{s_1+1} = \cdots
= \beta_{s_1+s_2} = 0$. This will result in another power index $\rho^*$. The ratio


(33) $e = \dfrac{\rho}{\rho^*}$


is called the efficiency factor of $Q$ with respect to $H_1^*$.

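Now let $f_\alpha$ be an idempotent of the center of $\mathfrak{A}$ with confounding coefficient
$\lambda_\alpha$. (If $\lambda_\alpha = 0$ let $f_\alpha = F_1$.) Testing the hypothesis $H_1$ under the assumption
$E(y) = p_1\beta^{(1)} + p_2\beta^{(2)}$ gives $Q_{r-a} = f_\alpha - f_\alpha p_2p_2'$.

Hence $2\sigma^2\delta = \beta^{(1)\prime}(p_1'f_\alpha p_1 - p_1'f_\alpha p_2p_2'p_1)\beta^{(1)}$. Now $p_1p_1'f_\alpha p_2p_2'p_1p_1' = \lambda_\alpha p_1p_1'f_\alpha p_1p_1'$
and on account of Theorem 1 we obtain

(34) $p_1'f_\alpha p_2p_2'p_1 = \lambda_\alpha p_1'f_\alpha p_1$


so that


(35) $2\sigma^2\delta = (1 - \lambda_\alpha)\beta^{(1)\prime}p_1'f_\alpha p_1\beta^{(1)}$.


On the other hand if the assumption is changed to $A \& H_2$ then the matrix
of $Q_a$ becomes $f_\alpha - f_\alpha p_1p_1'$ and the matrix of $Q_r$ is $f_\alpha$ and hence


(36) $2\sigma^2\delta^* = \beta^{(1)\prime}p_1'f_\alpha p_1\beta^{(1)}$.


For $\lambda_\alpha = 1$ we have $f_\alpha - f_\alpha p_2p_2' = 0$ so that no test is possible. For $\lambda_\alpha \neq 1$
we have


(37) $p_1p_1'(f_\alpha - f_\alpha p_2p_2')p_1p_1' = (1 - \lambda_\alpha)f_\alpha p_1p_1'$


which shows that rank $(f_\alpha - f_\alpha p_2p_2')$ = rank $(f_\alpha p_1p_1')$ so the efficiency of the
matrix $f_\alpha$ is $1 - \lambda_\alpha$. Hence

THEOREM 10. If $f_\alpha$ is a principal component with confounding coefficient $\lambda_\alpha \neq 1$
and if for $\lambda_\alpha = 0$, $f_\alpha p_1p_1' = f_\alpha$ then the efficiency of $f_\alpha$ with respect to every
alternative hypothesis $H_1^*$ is $1 - \lambda_\alpha$.

From (37) we also have

THEOREM 11. If $\lambda$ is any confounding coefficient then $0 \leq \lambda \leq 1$.

PROOF: $f_\alpha - f_\alpha p_2p_2'$ as well as $f_\alpha p_1p_1'$ are symmetric idempotent matrices and
therefore positive semi definite. Also $f_\alpha p_1p_1' \neq 0$. Hence $(1 - \lambda_\alpha) \geq 0$. Similarly
(34) implies $\lambda_\alpha \geq 0$.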
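
A numerical illustration may help fix the meaning of the confounding coefficient. The sketch below uses an assumed example that does not appear in the paper: the balanced incomplete block design with 3 treatments in 3 blocks of size 2. The idempotents `t` and `b` average over plots with the same treatment and the same block, the general mean `g` is removed from both (as suggested at the end of Section 2), and the code checks that $t_1b_1t_1 = \lambda t_1$ with $\lambda = 1/4$, so that the efficiency $1 - \lambda$ is $3/4$.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def averaging(labels):
    """Idempotent matrix: entry (i, j) is 1/k when plots i and j share a label."""
    n = len(labels)
    return [[Fraction(1, labels.count(labels[i])) if labels[i] == labels[j]
             else Fraction(0) for j in range(n)] for i in range(n)]


# Assumed design: blocks {1,2}, {1,3}, {2,3}; one plot per (block, treatment).
treatments = [1, 2, 1, 3, 2, 3]
blocks = [1, 1, 2, 2, 3, 3]
n = len(treatments)

t, b = averaging(treatments), averaging(blocks)
g = [[Fraction(1, n)] * n for _ in range(n)]
t1, b1 = sub(t, g), sub(b, g)      # generators with the general mean removed

lam = Fraction(1, 4)               # confounding coefficient in this example
single_root = matmul(matmul(t1, b1), t1) == scale(lam, t1)
efficiency = 1 - lam
print(single_root, efficiency)
```

The value $3/4$ agrees with the classical efficiency factor $\lambda v/(rk) = 1 \cdot 3/(2 \cdot 2)$ of this design, where $\lambda$ in that formula is the number of blocks shared by a pair of treatments.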

If we increase the size of the sample by replicating the experiments, then the
quantity $2\sigma^2\delta/h_1$ will be increased in direct proportion to the increase in sample
size. If we neglect the increase in power due to a corresponding increase in $h_2$
we can interpret Theorem 10 as stating that $\lambda_\alpha$ is proportional to the amount
of money spent in eliminating the parameters $\beta_{s_1+1}, \ldots, \beta_{s_1+s_2}$. In a situation
where the inhomogeneity of the second parameter set could be eliminated at a
given expense the confounding coefficients $\lambda_\alpha$ could therefore be used to decide
whether the elimination of inhomogeneity is really worthwhile.


4. Applications. A. T. James [1] has considered the important case in which the
coefficients $p_{\alpha j}$ are either 0 or 1 and where with $S_l = s_1 + \cdots + s_l$ we have


$\sum_{j=S_{l-1}+1}^{S_l} p_{\alpha j} = 1$.

The matrix $p_lp_l' = T_l = (\tau^{(l)}_{\alpha\beta})$ consists in this case of ones and zeros only.
We have $\tau^{(l)}_{\alpha\beta} = 1$ if for some $j$ we have $p^{(l)}_{\alpha j} = p^{(l)}_{\beta j} = 1$, otherwise $\tau^{(l)}_{\alpha\beta} = 0$. Such
matrices $T_l$ are called relationship matrices since $\tau^{(l)}_{\alpha\beta} = 1$ if and only if the
$\alpha$th and $\beta$th plot (experimental unit) receive the same treatment from the $l$th
set of treatments. Applying the matrix $T_l$ to the vector $y = (y_1, \ldots, y_N)'$
will replace every $y_\alpha$ by the total of those observations which receive the same
treatment of the $l$th set as $y_\alpha$. If every treatment of the $l$th set is repeated the
same number say $k_l$ of times then applying the foregoing remark to the columns
of $T_l$ itself we get $T_l^2 = k_lT_l$ so that $T_l/k_l = t_l$ will be idempotent. The matrix
$t_l$ applied to $y$ replaces every observation $y_\alpha$ by the mean of those observations
which receive the same treatment as $y_\alpha$.
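
The action of a relationship matrix is easy to see on a few plots. In the sketch below the treatment labels and the observations are made up for illustration; the code builds $T$ from the labels, checks $T^2 = kT$, and shows that $T$ produces treatment totals while $T/k$ produces treatment means.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


treatments = ["a", "b", "a", "b"]  # hypothetical assignment of 4 plots
n = len(treatments)
k = 2                              # each treatment is repeated k times

# Relationship matrix: 1 when two plots receive the same treatment.
T = [[1 if treatments[i] == treatments[j] else 0 for j in range(n)]
     for i in range(n)]

y = [1, 4, 3, 8]                   # made-up observations
totals = [sum(T[i][j] * y[j] for j in range(n)) for i in range(n)]
means = [Fraction(v, k) for v in totals]
square_is_multiple = matmul(T, T) == [[k * x for x in row] for row in T]
print(totals, means, square_is_multiple)
```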
A. T. James has given the decomposition for balanced incomplete block designs.
If the design is asymmetric, $r > k$, then one obtains three one dimensional and
one $2 \times 2$ complete matrix algebras as principal components of the algebra
($\mathfrak{A} \cup I$). If the design is symmetric then one of the one dimensional algebras
(the algebra YF; of [1] p. 1000) reduces to 0 since in this case BTB = (r — X)B 
(Mod. G). It may be left to the reader to obtain this decomposition from 
Theorem 6. 

In the following we shall decompose the algebra of an s dimensional cubic 
lattice design into its principal components. This example exhibits all the features 
of the general case and at the same time does not present any computational 


difficulties. 


5. The principal components of an s dimensional cubic lattice design. In an
$s$ dimensional cubic lattice design $m^s$ treatments are arranged into $s$ sets of
blocks each containing a complete replication. The blocks are formed in the
following way. The treatments are distinguished by a set of $s$ indices and are
written $t_{i_1 \cdots i_s}$, $1 \leq i_1 \leq m$, $\ldots$, $1 \leq i_s \leq m$. In the first replication the blocks are
formed by keeping the indices $i_2, \ldots, i_s$ fixed and varying the first index. In
the $\alpha$th replication the blocks are formed from all treatments with indices
$i_1, \ldots, i_{\alpha-1}, i_{\alpha+1}, \ldots, i_s$ fixed. Thus every replication contains $m^{s-1}$ blocks of
$m$ treatments each. For instance for $s = 2$, $m = 3$ we have the blocks


$(t_{11}, t_{21}, t_{31})$, $(t_{11}, t_{12}, t_{13})$,
$(t_{12}, t_{22}, t_{32})$, $(t_{21}, t_{22}, t_{23})$,
$(t_{13}, t_{23}, t_{33})$, $(t_{31}, t_{32}, t_{33})$.
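
The block structure just described can be generated mechanically. The following sketch is an illustration with a hypothetical function name; it lists, for each replication, the blocks obtained by fixing all indices but one, and for $s = 2$, $m = 3$ it reproduces the six blocks displayed above.

```python
from itertools import product


def lattice_blocks(s, m):
    """For each replication, list the blocks as lists of treatment index tuples."""
    replications = []
    for alpha in range(s):  # replication alpha varies index number alpha
        blocks = []
        for fixed in product(range(1, m + 1), repeat=s - 1):
            block = [fixed[:alpha] + (i,) + fixed[alpha:]
                     for i in range(1, m + 1)]
            blocks.append(block)
        replications.append(blocks)
    return replications


reps = lattice_blocks(2, 3)
for blocks in reps:
    print(blocks)
```

Each replication contains $m^{s-1}$ blocks of $m$ treatments, so every treatment occurs exactly once per replication.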


The values observed for the treatment $t_{a_1 \cdots a_s}$ in the $\alpha$th replication will be
denoted by $(\alpha)x_{a_1 \cdots a_s}$. By $(\alpha)x^{i_1 \cdots i_u}_{a_1 \cdots a_s}$ we shall denote the sum of all observations
with $i_1$st index $a_{i_1}$, $i_2$nd index $a_{i_2}$, $\ldots$, $i_u$th index $a_{i_u}$ and we shall call such a
quantity a class total. The assumption reads


$E((\alpha)x_{a_1 \cdots a_s}) = t_{a_1 \cdots a_s} + (\alpha)b_{a_1 \cdots a_{\alpha-1}a_{\alpha+1} \cdots a_s}$.


(Usually the restriction $\sum_{a_1, \ldots, a_s} t_{a_1 \cdots a_s} = 0$ is imposed and a general mean
introduced, but since by Theorem 4.4 of [2] the Lagrange operator for this
restriction is 0 we may ignore it and add the general mean to the block effects.
By Theorem 3 this does not affect the regression values.)

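We form according to Section 3 the matrices $T$ relating two plots with the
same treatment and $B$ relating two plots from the same block.

From Section 3 we have


$B((1)x_{a_1 \cdots a_s}) = (1)x^{2 \cdots s}_{a_1 \cdots a_s}$,

$B((\alpha)x_{a_1 \cdots a_s}) = (\alpha)x^{1 \cdots \alpha-1\,\alpha+1 \cdots s}_{a_1 \cdots a_s}$,

$TB((\alpha)x_{a_1 \cdots a_s}) = \sum_\alpha (\alpha)x^{1 \cdots \alpha-1\,\alpha+1 \cdots s}_{a_1 \cdots a_s}$.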

Thus we have 







Proposition 1: If (1)xi;"%a, = (@)xa,..o, = Za,..-0, then 


(38) TR alee:=,.) « 2° GS ee - 


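A class total $(\alpha)x^{i_1 \cdots i_u}_{a_1 \cdots a_s}$ is called confounded if $\alpha \neq i_j$, $j = 1, \ldots, u$. Let
$\Sigma_k$ denote the sum of all confounded class totals with $s - k$ indices chosen
out of $a_1, \ldots, a_s$. For instance

$\Sigma_2 = (1)x^{2}_{a_2} + (1)x^{3}_{a_3} + (2)x^{1}_{a_1} + (2)x^{3}_{a_3} + (3)x^{1}_{a_1} + (3)x^{2}_{a_2}$.

PROPOSITION 2: For $k < s$

(39) $TB\,\Sigma_k = mk\,\Sigma_k + k\,\Sigma_{k+1}$.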
Proor: We put (a)zs,..'s, = > i" “* and apply proposition 1. We obtain 


(40) TBD = Dy, DP + Dy Dee + Ds, Det, 


In the ith sum every class total not containing the upper index / occurs m 
times. Therefore since there are k upper indices missing in every class total 
occurring in any sum )>_,'”* every class total with k upper indices missing and 
s — k lower indices chosen out of a, , --- , a, will occur mk times on the right 
of (40) giving rise to the first term on the right of (39). The class totals with 
k + 1 upper indices missing arise from those terms of the [th sum which contain 
the upper index / but do not have the prefix | (since terms with prefix and upper 
index | are not confounded). Hence each such term arises from exactly k of the 
terms in the right of (40). This proves (39). 

Let $\Sigma_k$ denote the transformation which replaces $(\alpha)x_{a_1 \cdots a_s}$ by $\Sigma_k$. We have


$TB = \Sigma_1$,
$(TB)^2 = TB\,\Sigma_1 = m\Sigma_1 + \Sigma_2 = mTB + \Sigma_2$.

Suppose we have shown that for $k < s - 1$

(41) $TB(TB - m) \cdots (TB - km) = k!\,\Sigma_{k+1}$.

We multiply (41) by $TB - (k + 1)m$ and get

$(TB - (k + 1)m)\,k!\,\Sigma_{k+1} = (k + 1)!\,\Sigma_{k+2}$.

Hence we have proved

(42) $TB(TB - m) \cdots (TB - km) = k!\,\Sigma_{k+1}$ for $k \leq s - 1$.


Since $\Sigma_s = x$ = sum of all observations we get

$\Sigma_s = G$ say,

and from (42) for $k = s - 1$

(43) $TB(TB - m) \cdots (TB - (s - 1)m) = (s - 1)!\,G$.

Dividing (43) by $(ms)^s$ we get


(44) $tb\left(tb - \dfrac{1}{s}\right) \cdots \left(tb - \dfrac{s-1}{s}\right) = \dfrac{(s-1)!}{s^{s-1}}\,g$,


where $t$, $b$, $g$ are the idempotents corresponding to $T$, $B$ and $G$.

Remembering the effect of $TB$ we see that one application of $TB$ can delete
at most one upper index in a class total. Hence $s$ applications of $TB$ are
needed to produce a term with all indices deleted. On the other hand
$(TB)^kx_{a_1 \cdots a_s}$ for $k \leq s$ involves terms which are not involved in $(TB)^{k-1}x_{a_1 \cdots a_s}$.
Hence a polynomial in $TB$ of degree less than $s$ cannot vanish nor be a multiple
of $G$. Thus if we put $t - g = t_1$, $b - g = b_1$ then


(45) $t_1b_1\left(t_1b_1 - \dfrac{1}{s}\right) \cdots \left(t_1b_1 - \dfrac{s-1}{s}\right) = 0$


is the minimal equation of $t_1b_1$.
From (45) and Theorem 6 the decomposition of the algebra of the $s$
dimensional cubic lattice can be obtained without any effort.
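
The polynomial identities of this section can be checked directly on small lattices. The sketch below is an illustration under stated assumptions: it takes $t = T/s$, $b = B/m$ and $g = G/(sm^s)$ as the idempotents corresponding to $T$, $B$ and $G$, reads the result of dividing (43) by $(ms)^s$ as $tb(tb - 1/s) \cdots (tb - (s-1)/s) = ((s-1)!/s^{s-1})\,g$, and verifies this together with (45) in exact arithmetic. The function name and the plot ordering are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lattice_check(s, m):
    """Return (identity for tb holds, minimal equation (45) holds)."""
    # One plot per (replication, treatment index tuple).
    plots = [(alpha, a) for alpha in range(s)
             for a in product(range(m), repeat=s)]
    n = len(plots)  # s * m**s

    def same_block(u, v):
        (ru, a), (rv, c) = u, v
        return ru == rv and a[:ru] + a[ru + 1:] == c[:rv] + c[rv + 1:]

    t = [[Fraction(int(u[1] == v[1]), s) for v in plots] for u in plots]
    b = [[Fraction(int(same_block(u, v)), m) for v in plots] for u in plots]
    g = [[Fraction(1, n)] * n for _ in range(n)]

    def poly(x):
        # x (x - 1/s) (x - 2/s) ... (x - (s-1)/s)
        result = x
        for j in range(1, s):
            shifted = [[x[r][q] - (Fraction(j, s) if r == q else 0)
                        for q in range(n)] for r in range(n)]
            result = matmul(result, shifted)
        return result

    tb = matmul(t, b)
    coef = Fraction(factorial(s - 1), s ** (s - 1))
    holds_44 = poly(tb) == [[coef * v for v in row] for row in g]

    t1b1 = [[tb[r][q] - g[r][q] for q in range(n)] for r in range(n)]
    zero = [[0] * n for _ in range(n)]
    holds_45 = poly(t1b1) == zero
    return holds_44, holds_45


print(lattice_check(2, 2), lattice_check(2, 3), lattice_check(3, 2))
```

By (45) the nonzero roots of $t_1b_1$ are $1/s, \ldots, (s-1)/s$; with the assignment $p_1p_1' = t_1$, $p_2p_2' = b_1$, Theorem 6 then gives one $2 \times 2$ component for each of these roots.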


6. The case r > 2. 
A part of Theorem 6 carries over easily to the case r > 2. If there is a matrix 
G e¢ A satisfying the conditions of Theorem 5 we may write 


Y= A-— AG + AG. 


If there is no G # O satisfying Theorem 5 we shall put G = 0. To exclude 
trivialities we also assume that Dip: is singular. Using these conventions we 
can state 

THEOREM 12. Let 


Q; = Pi — pi, 


(46) a 
Ti = ppQQipp . 


Let t= : *-,7,a@ = 1--- n, be the distinct non 0 characteristic roots of 
T,;. Let 


p, = (Tix) += (Ti = dae) gt 
(47) (—1)"*AiY --- ALY Pi 





‘> 


e& = ppi- Fi -G 
and let B be the algebra generated by e; , --- , e- . Then 


(F; for i=j 


( i 
Se ae Pm fat 


(i) PPPs =} 9 for ij’ 
(ii) A= AG+ AP, + --- + AF, + &. 







(iii) G, Fy, --- , F, are annulled by 8 and are principal one dimensional com- 
ponents of A or are equal to 0. 
(iv) The equation 


(48) >, €sBi = &; 


It 


has a solution B, ¢ B. 
PROOF: From


(49) TF; = 0 


we get, multiplying by $G$, $(r - 1)GF_i = 0$. Hence $GF_i = 0$.
From (49) we get, on account of Theorem 1,


(50) QF; = 0. 
From the definition (1) of p; we find 


(51) pQ: = pp; for j # i, 


and so from (50), pwpF = Ofori # j. 

From (9) we see that F, is idempotent, so that (i) is proved. 

The statements (ii) and (iii) are immediate consequences of (i). By (47) 
we may write mod. G 


G& = T (ao Py Ay + a Ti + +++ Qn T?) = «(2 4) B, 
1) 


with B, e B. This completes the proof of Theorem 12 since e, G = 0. 


REFERENCES 


[1] A. T. James, "The relationship algebra of an experimental design," Ann. Math. Stat.,
Vol. 28 (1957), pp. 993-1002.

[2] H. B. Mann, Analysis and Design of Experiments, Dover Publications, Inc., New York,
1949.

[3] B. L. van der Waerden, Modern Algebra, Vol. II, Frederick Ungar Publishing Company,
New York, 1950.





AN OPERATIONAL APPROACH TO THE r-WAY CROSSED
CLASSIFICATION¹


By J. D. BANKIER 
McMaster University 


1. Summary. An operational method is used to obtain known formulas [2] for 
the expected values of the mean squares and the variances of estimates of vari- 
ance components obtained from the analysis of variance of an r-way crossed 
classification. The results are independent of normality assumptions. 


2. Introduction. We begin with the results and notation of a book by Mann
[4], considering an r-way classification with replication. An observation is
denoted by $x_a = x_{a_1 \cdots a_{r+1}}$, $a_i = 1, \cdots, t_i$, where the last subscript is used
to indicate replications. Main effects and interactions are represented by
$\mu(I, a_I) = \mu(i_1, \cdots, i_k; a_{i_1}, \cdots, a_{i_k})$, where $I = (i_1, \cdots, i_k)$ is a subset
of $R = (1, \cdots, r)$, and $\mu(I, a_I) = \mu$ if $I$ is the null set.

We now introduce two operators: $D_i = D_{a_i}$, which drops $i$ and $a_i$ from $\mu(R, a_R)$,
and $M_i = M_{a_i}$, which averages a function over $a_i$ if $a_i$ appears and otherwise
leaves the function unchanged. These operators commute with themselves and
with each other, and all the ordinary laws of algebra, excluding
division, hold. It is assumed that


(2.1) $E(x_a) = (1 + D)_R\,\mu(R, a_R)$,

(2.2) $M_{i_j}\,\mu(I, a_I) = 0, \quad j = 1, \cdots, k$,

where $(1 + D)_R = (1 + D_1) \cdots (1 + D_r)$. It is easy to establish
LEMMA 2.1. Independent of condition (2.2),

(2.3) $M_iD_i = D_i$, $(1 - M_i)D_i = 0$, $M_i(1 + D_i) = M_i + D_i$,
      $(1 - M_i)(1 + D_i) = 1 - M_i$.

Subject to condition (2.2),

(2.4) $(M_i + D_i)\mu(R, a_R) = D_i\mu(R, a_R)$, $(1 - M_i)\mu(R, a_R) = \mu(R, a_R)$,

provided that these expressions are not multiplied by $D_i$.
Mann establishes an identity [4, p. 52]

(2.5) $tMx_a^2 = (1 + D)\,SS(R + 1)$


Received July 17, 1957; revised June 17, 1959. 
1 This investigation was supported (in part) by a research grant from the National Re- 
search Council of Canada. 




where $R + 1 = (1, \cdots, r + 1)$, $t = t_{R+1} = t_1 \cdots t_{r+1}$, $(1 + D) = (1 + D)_{R+1}$,

(2.6) $SS(I) = tQ(I)/t_I = tM_I A^2(I, a_I)$,

(2.7) $A(I, a_I) = M_{R+1-I}(1 - M)_I\,x_a$,

$t_I = t_{i_1} \cdots t_{i_k}$, $M_{R+1-I} = M_1 \cdots M_{r+1}/M_{i_1} \cdots M_{i_k}$, and
$(1 - M)_I = (1 - M_{i_1}) \cdots (1 - M_{i_k})$.

The following lemma is an immediate consequence of definition (2.7).
LEMMA 2.2. $A(I, a_I) = M_b(t\delta_{ab} - 1)_I\,x_b$, where

$(t\delta_{ab} - 1)_I = (t_{i_1}\delta_{a_{i_1}b_{i_1}} - 1) \cdots (t_{i_k}\delta_{a_{i_k}b_{i_k}} - 1)$.


The above notation and operators can be employed to simplify considerably 
the usual derivations of the analysis of variance for the r-way crossed classifi- 
cation. As an example, we prove the identity (2.5). Making use of the equations 
(2.6), (2.7), and Lemma 2.2, we have 


$(1 + D)\,SS(R + 1) = (1 + D)_{R+1}\,tM_aA^2(R + 1, a_{R+1})$
$= t(1 + D)_{R+1}M_aM_bM_c(t\delta_{ab} - 1)_{R+1}(t\delta_{ac} - 1)_{R+1}x_bx_c$
$= t(1 + D)_{R+1}M_bM_c(t\delta_{bc} - 1)_{R+1}x_bx_c$
$= tM_bM_c(1 + t\delta_{bc} - 1)_{R+1}x_bx_c = tM_bx_b^2$.
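The identity (2.5) can also be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a small hypothetical three-index table (so $r = 2$ plus replication), implements the averaging operator $M_i$ directly, forms each component $A(I, a_I)$ by applying $1 - M_i$ for $i$ in $I$ and $M_i$ otherwise, and compares the total sum of squares with the sum of all $SS(I)$. The function names and the data are illustrative assumptions.

```python
from itertools import product

def average_over(x, shape, axis):
    """Operator M_axis: replace every entry by its mean over `axis`."""
    out = {}
    for idx in product(*(range(n) for n in shape)):
        cells = [idx[:axis] + (v,) + idx[axis + 1:] for v in range(shape[axis])]
        out[idx] = sum(x[c] for c in cells) / shape[axis]
    return out

def component(x, shape, subset):
    """A(I): apply (1 - M_i) for i in I and M_i for every other index."""
    y = dict(x)
    for axis in range(len(shape)):
        mean = average_over(y, shape, axis)
        y = {k: y[k] - mean[k] for k in y} if axis in subset else mean
    return y

def sum_of_squares(x, shape, subset):
    """SS(I) = t M A^2(I), i.e. the sum of A^2 over all cells."""
    return sum(v * v for v in component(x, shape, subset).values())

def all_subsets(n):
    """Every subset of the index set {0, ..., n-1}."""
    for mask in range(2 ** n):
        yield {i for i in range(n) if mask >> i & 1}

shape = (2, 3, 2)  # hypothetical t_1, t_2 and t_3 (replications)
x = {(i, j, k): float((7 * i + 3 * j * j + 5 * k + i * j * k) % 11)
     for i, j, k in product(*(range(n) for n in shape))}

total = sum(v * v for v in x.values())                      # t M x^2
decomposed = sum(sum_of_squares(x, shape, s)                # (1 + D) SS(R + 1)
                 for s in all_subsets(len(shape)))
print(total, decomposed)
```

Applying the operators one index at a time is legitimate here only because they commute, which is the property the text relies on.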


3. The Type I model. We define $\epsilon_a$ by the equation $x_a = E(x_a) + \epsilon_a$ and make
the usual assumptions for the Type I model [1, p. 348] save that we do not assume
that the $\epsilon_a$ are normally distributed. It will be sufficient if they are
independently distributed with zero means, a common variance, $\sigma^2$, and such other
moments as we require. In addition to the sum of squares $SS(I)$, we will be
interested in the error sum of squares, SSE, which is obtained by summing all
$SS(I)$ for which the last index is $r + 1$. It is easy to verify, using the operators,
that

(3.1) $SSE = tM[(1 - M_{r+1})x_a]^2$.

The corresponding mean sums of squares, $MS(I)$ and MSE, are obtained by
dividing the above sums of squares by their degrees of freedom, which are $(t - 1)_I$
and $t_R(t_{r+1} - 1)$, respectively. In the case where $t_{r+1} = 1$, it is necessary to
assume that the $\mu(R, a_R)$ are zero. In this case the value of $SS(I)$ is unchanged,
but SSE is $SS(R)$ and the corresponding degrees of freedom are $(t - 1)_R$.

We now state and prove two lemmas. 

LEMMA 3.1. For the Type I model

(3.2) $A(I, a_I) = \mu(I, a_I) + M_{R+1-I}(1 - M)_I\,\epsilon_a$.

PROOF. We deduce from (2.7) and (2.1) that we must prove

$\mu(I, a_I) = M_{R+1-I}(1 - M)_I(1 + D)_R\,\mu(R, a_R)$.







By (2.3), (2.4), and (2.2)

$M_{R+1-I}(1 - M)_I(1 + D)_R\,\mu(R, a_R) = M_{R-I}(1 - M)_I(1 + D)_{R-I}\,\mu(R, a_R)$
$= (1 - M)_I(M + D)_{R-I}\,\mu(R, a_R)$
$= (1 - M)_ID_{R-I}\,\mu(R, a_R) = (1 - M)_I\,\mu(I, a_I) = \mu(I, a_I)$.

LEMMA 3.2. For the Type I model

(3.3) $E[M_{R+1-I}(1 - M)_I\,\epsilon_a]^2 = (t - 1)_I\sigma^2/t$.

PROOF. The desired expected value is necessarily of the form $c\sigma^2$, where $c$ is
a constant which does not depend upon the form of the distribution of the $\epsilon_a$,
which we may assume to be NID$(0, \sigma^2)$. Under these conditions, $SS(I)$ is
distributed as $\chi^2\sigma^2$ when the $\mu(I, a_I) = 0$, and the result (3.3) is easily established.

As an almost immediate consequence of these lemmas we obtain: 

THEOREM 3.1. For the Type I model

(3.4) $E[MS(I)] = \sigma^2 + t\sigma^2(I)/t_I$

where $\sigma^2(I) = t_IM_I\mu^2(I, a_I)/(t - 1)_I$. When $t_{r+1} = 1$, $E(MSE) = \sigma^2 + \sigma^2(R)$,
and, otherwise, $E(MSE) = \sigma^2$.

Assuming that $t_{r+1} > 1$ and equating mean squares to their expected values,
we obtain the unbiased estimates of the variance components

$\hat\sigma^2(I) = t_I[MS(I) - MSE]/t, \quad \hat\sigma^2 = MSE$.

Examination of $(SSE)^2$ indicates that $E(SSE)^2 = k_1\mu_4 + k_2\mu_2^2$, where $k_1$ and $k_2$
are independent of the distribution of the $\epsilon_a$. Accordingly, we may assume that
the $\epsilon_a$ are NID$(0, \sigma^2)$ and obtain a linear relationship between $k_1$ and $k_2$, since
SSE is then distributed as $\chi^2\sigma^2$. A little computation determines $k_1$ and leads
to the conclusion that var$(\hat\sigma^2) = (\mu_4 - 3\mu_2^2)/t + 2\mu_2^2/t_R(t_{r+1} - 1)$.

Making use of (3.2), we find that $\hat\sigma^2(I) = \sigma^2(I) + D + F$, where

$D = 2t_IM_b\,\mu(I, b_I)\,\epsilon_b/(t - 1)_I$,
$F = t_IM_bM_c[(t\delta_{bc} - 1)_I/(t - 1)_I - (t_{r+1}\delta_{b_{r+1}c_{r+1}} - 1)\delta_{b_Rc_R}/(t_{r+1} - 1)]\,\epsilon_b\epsilon_c$.

We note that the quadratic form $F$ contains no squared terms, $E(D) = E(DF) =
E(F) = 0$, $E(D^2) = 4t_I^2M_bM_c\,\mu(I, b_I)\mu(I, c_I)\delta_{bc}\sigma^2/(t - 1)_I^2 = 4t_I\sigma^2\sigma^2(I)/t(t - 1)_I$,
and a similar, but longer, calculation gives

$E(F^2) = 2t_I^2[1/(t - 1)_I + 1/t_R(t_{r+1} - 1)]\mu_2^2/t^2$.

These results lead to the conclusion that

(3.5) var$[\hat\sigma^2(I)] = 4t_I\sigma^2\sigma^2(I)/t(t - 1)_I + 2t_I^2[1/(t - 1)_I + 1/t_R(t_{r+1} - 1)]\sigma^4/t^2$.







4. The Type II model. The usual assumptions are made for a Type II model 
[3] save that we make no normality assumptions. We recall that equation (2.2) 
no longer holds. We state 

LEMMA 4.1. For any model,

$A(I, a_I) = M_{R-I}(1 - M)_I(1 + D)_{R-I}\,\mu(R, a_R) + M_{R+1-I}(1 - M)_I\,\epsilon_a$.

PROOF. This relation was established in the proof of Lemma 3.1.
This lemma and the method used to prove Theorem 3.1 lead at once to:
THEOREM 4.1. For the Type II model,

(4.1) $E[MS(I)] = \sigma^2 + t(1 + D)_{R-I}[\sigma^2(R)/t_R]$,

where $\sigma^2(R)$ is the variance of the $\mu(R, a_R)$.
Equating mean squares to their expected values and solving, we obtain:
LEMMA 4.2. For a Type II model, without normality assumptions,

$\hat\sigma^2(I) = (-1)^{r-k}(1 - D)_{R-I}[MS(R)]/t_{R+1-I} \quad (k < r)$,

and

$\hat\sigma^2(R) = [MS(R) - MSE]/t_{r+1}$.


The estimates are given in a more convenient form in 


Lemma 4.3. For a Type II model, without normality assumptions, (1) = 
A+ B+4+C (k <r), where 


A teM pM eg (tdre — 1) (1 — bre) nrg Wee/(t — 1) a, 
B 2teMr_.M (tbre — 1)1(1 — Bre) wr rgte/(t — 1)e, 
C = teMiM (tbe — 1)1(1 — bre) n—reee/(t — Ie, 
Ww, = (1+ D)e_m(R, be). 
It will be noted that the coefficients of ws, and « are equal to zero and it 
follows that Efo*(1)|’ = E(A*® + B’ + C*), and computation gives us 
varfo"(I)] = [u(1) — 30°(1)\/tr 
(4.2) + 2p Do (I + X)o%(I + ¥)/t(t — Vater (t — 1) xrtx—artror 


¥ocR- 
+ 4t;0°(1 + D)g_sfo"(R)/(t — 1)g)/t + Qtyteo®/P(t — 1)e (k <r), 


where XY is the set of numbers common to X and Y. 
Dhue > i, 


var(o*(R)] = [u(R) — 30°(R)]/te + 20°(R)/(t — 1)x + 40°0" (R)/trar(t — 1) 
+ 21/(t — 1)e + 1/te(trss — 1)Jo*/Oy. 


5. The Type III model. We assume that the $t_I$ expressions of the form $\mu(I, a_I)$
are a sample from a finite population, $P(I)$, consisting of $T_I$ members with zero
mean and variance

(5.1) $\sigma^2(I) = T_I\mathfrak{M}_I\,\mu^2(I, a_I)/(T - 1)_I$

where $\mathfrak{M}_i = \mathfrak{M}_{a_i}$ is an operator which averages a function over $a_i$ when the
range of $a_i$ is from 1 to $T_i$. We also assume that

(5.2) $\mathfrak{M}_{i_j}\,\mu(I, a_I) = 0, \quad j = 1, \cdots, k$,

and that random variables from different populations are independent. We draw
our sample from $P(I)$ by selecting $t_{i_j}$ numbers at random from the set $1, \cdots,
T_{i_j}$ $(j = 1, \cdots, k)$. Lemma 4.1 holds under these conditions, so our problem
reduces to finding the expected value of expressions of the form

$[M_{R-I}(1 - M)_I\,\mu(R, a_R)]^2$.


We will require the following lemmas. 
LEMMA 5.1. For a Type III model

$E[\mu(I, a_I)\,\mu(I, b_I)] = (\delta_{ab} - 1/T)_I\,\sigma^2(I)$

where $\delta_{a_{i_j}b_{i_j}} = 1$ if $a_{i_j} = b_{i_j}$ and is zero otherwise.
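For a single factor, Lemma 5.1 says that two draws of the same member have expected product $(1 - 1/T)\sigma^2$, while two distinct members drawn without replacement have expected product $-\sigma^2/T$. The sketch below checks both statements exactly for one small hypothetical population; the population values and the function name are assumptions made only for illustration.

```python
from fractions import Fraction
from itertools import permutations

def check_lemma_5_1(values):
    """Return (E[mu_a^2], E[mu_a mu_b] for a != b, sigma^2) for one factor."""
    T = len(values)
    assert sum(values) == 0, "population must have zero mean"
    sigma2 = Fraction(sum(v * v for v in values), T - 1)   # definition (5.1)
    same = Fraction(sum(v * v for v in values), T)          # a = b
    pairs = list(permutations(values, 2))                   # ordered pairs, a != b
    diff = Fraction(sum(u * v for u, v in pairs), len(pairs))
    return same, diff, sigma2

population = [-3, -1, 0, 1, 3]  # hypothetical effects, T = 5
same, diff, sigma2 = check_lemma_5_1(population)
T = len(population)
print(same == (1 - Fraction(1, T)) * sigma2)  # delta_ab = 1 case
print(diff == -sigma2 / T)                    # delta_ab = 0 case
```

With several factors the expectation factorises over the indices in $I$, which is what the product notation $(\delta_{ab} - 1/T)_I$ expresses.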
LEMMA 5.2. For the Type III model

$E[M_{R-I}(1 - M)_I\,\mu(R, a_R)]^2 = (1 - 1/t)_I(1/t - 1/T)_{R-I}\,\sigma^2(R) \quad (k \le r)$.

The application of the above lemmas leads to the following theorem, which
has also been stated by Bennett and Franklin [1].
THEOREM 5.1. For the Type III model

$E[MS(I)] = \sigma^2 + t(1 + D)_{R-I}(1 - t/T)_{R-I}\,\sigma^2(R)/t_R$.

Equating mean squares to their expected values and solving, we obtain
THEOREM 5.2. For the Type III model

$\hat\sigma^2(I) = [(D - 1)_{R-I}(1 - t/T)_{R-I}MS(R) - t_{R-I}MSE/T_{R-I}]/t_{R+1-I} \quad (k \le r)$.


Computations which are too long to be included in this paper lead to the 
conclusion that 


(5.3) Ele(I)f =A+B+C+D 

where 

A = A(t;/t)o*(1 + D)e-A(1 — 1/T)a-1(1 — t/T) 2-10°(R)/(t — Lal, 
Qtats{[(t — 1)/T? + (1 — 1/T)*Jean + (t — L)a/tr(tess — 1)Te—ho*/t, 
2 YS (l= t/T)xsvo"(I + X)o°( + ¥)/(t — Walt — 1) xrtzsr—er, 


X,¥CR-I 







and 


te(t —1)eD = DO tesx(t — 1)e-r2(1 — t/T)x/ 


XcR-i 


TT -1(T - 20 - 3a xX D> DOS SZ 


S14X O14x Max ™I4x 

((¢ — 1)°7? — (3t — 1)(t — 1)T + © + Qbsdan 
+ (T — t)(T — t — 1)(8pbsm + Bruder) 
+ T(T — t){(t — 1)T — t — lbs Serdrn)r 
x [(38 — t — t/T — 1/T )bygiam + (T — :1)°(T — t — 1) 
(Byrom + Symign)/T + {[—-27? + (3t 4+ 1)T —t- YJ 
Se dahals X wll + ZX, Sere) 

ul + X, grax) eC + X, Arax) wl + X, mixx). 


The following formula, which is used in the derivation of the expression for
$D$, may be of interest:


(T(T — 1)(T — 2)(T — 3)]2F[u(R, be)u(R, ce) u(R, da)u(R, er)] 
= Dd Dd (Ail? + ArT + As)bsp5nm + (BT? + BT + Bs)bp5om 


tr ¢e ‘re ™e 


+ (CyT? + C2T + C3) bsmdgn + (D,T® + DzT? + DsT — 6)b,Sondnm)n 
-p(R, Sa)u(R, gx ul R, he)p(R, my) 


where 
A; = bbag — C(3, 4), Ao = C(2, 3) + C3, 4) — bee — Bae — Bbndac , 
A; 1 + 38. + 364 — C(1, 2) — C(2,3) + C(2, 4), By = b5.. — C(3, 4), 
B, = C(2,3) + C(3, 4) — bra — bee — Bbnabee , 
1 + 364 + 36. — C(1, 2) — C(2,3) + C(2, 4), Ci = bub — C(3, 4) 
C(2, 3) + C(3, 4) — bre — bea — Bbsebea , 
1 + 3b + 36.4 — C(1,2) — C(2, 3) + C(2, 4), Di = C(3, 4), 
= ((3,4) — C(2,3) — C(2, 4), D; = 2C(1, 2) — C(2, 3) + C(2, 4), 
C(1, 2) = bre + bea + Bre + Sra + See + Ste , 
C(2, 3) = Bebra + SrBee + SrdBie + Seabac « 
C(2,4) = brbae + SsdBee + SrcBea « 


—_ br Sedbite . 







6. Acknowledgement. I am indebted to the referees for helpful comments and 
suggestions. 


REFERENCES 


[1] C. A. Bennett and N. L. Franklin, Statistical Analysis in Chemistry and the Chemical
Industry, John Wiley & Sons, New York, 1954.

[2] Jerome Cornfield and John W. Tukey, "Average values of mean squares in
factorials," Ann. Math. Stat., Vol. 27 (1956), pp. 907-949.

[3] S. L. Crump, "The estimation of variance components in analysis of variance,"
Biometrics Bulletin, Vol. 2 (1946), pp. 7-11.

[4] H. B. Mann, Analysis and Design of Experiments, Dover Publications, New York, 1949.





SECOND ORDER ROTATABLE DESIGNS IN FOUR OR MORE
DIMENSIONS¹


By Norman R. Draper²
University of North Carolina 


0. Introduction. The technique of fitting a response surface is one widely used 
(especially in the chemical industry) to aid in the statistical analysis of experi- 
mental work in which the “yield” of a product depends, in some unknown 
fashion, on one or more controllable variables. Before the details of such an 
analysis can be carried out, experiments must be performed at predetermined 
levels of the controllable factors, i.e., an experimental design must be selected 
prior to experimentation. Box and Hunter [2] suggested designs of a certain 
type, which they called rotatable, as being suitable for such experimentation. 
Such designs permit a response surface to be fitted easily and provide spherical 
information contours. A second order rotatable design aids the fitting of a 
second order (i.e., a quadratic) surface. 

Let us assume that the measurements of the factors have been coded, permitting
the use of cartesian axes in $k$-dimensional space to describe an experimental
design for $k$ factors. Suppose, in an experimental investigation with $k$ factors,
$N$ (not necessarily distinct) combinations of levels are employed. Thus the
group of $N$ experiments which arises can be described by the $N$ points in $k$
dimensions $(x_{1u}, x_{2u}, \cdots, x_{ku})$, $u = 1, 2, \cdots, N$, where, in the $u$th experiment,
factor $i$ is at level $x_{iu}$. This set of points is said to form a rotatable arrangement
of the second order in $k$ factors if

$\sum x_{1u}^2 = \sum x_{2u}^2 = \cdots = \sum x_{ku}^2 = \lambda_2N$,

$\sum x_{1u}^4 = \sum x_{2u}^4 = \cdots = \sum x_{ku}^4 = 3\sum x_{iu}^2x_{ju}^2 = 3\lambda_4N, \quad (i \ne j)$,


and all other sums of powers and products up to and including order four are 
zero, where all summations are over u = 1 to u = N. The point set is said to 
form a rotatable design of second order if the conditions above are satisfied and 
a certain matrix used in a consequent least squares estimation is non-singular. 
Box and Hunter [2] show that the necessary and sufficient condition for this to 


Received March 30, 1959. 

¹ This work was supported in part by the United States Air Force through the Air Force
Office of Scientific Research of the Air Research and Development Command, under
Contract No. AF 18(600)-83 and in part by a Bell Telephone Graduate Fellowship award to the
author who acknowledges with gratitude his indebtedness. Reproduction in whole or part
is permitted for any purpose of the United States Government.

² Now with Imperial Chemical Industries (Plastics Division), Welwyn Garden City,
Herts, England.




by the addition of points at the center (0, 0, 0) of the design. The inequality 
becomes an equality only when all the design points lie on a k-dimensional 
sphere. 

When presenting a rotatable design, it is customary to "scale" it. By this it
is meant that the scale of the coded controllable variables is chosen in such a
way that $\lambda_2 = 1$. The reason for this is as follows. Given a second order design
with a specified value of $\lambda_4/\lambda_2^2$, there are an infinite number of values possible for
$\lambda_2 > 0$. Since these designs can be derived one from another merely by change of
scale, we do not regard them as different. Thus the scaling condition $\lambda_2 = 1$
fixes a particular design and enables better comparison between two designs with
different values of $\lambda_4/\lambda_2^2$.

A previous paper by Bose and Draper [1] presented a new method for obtain- 
ing infinite classes of second order rotatable designs in three dimensions. In the 
present paper it is shown how the method previously employed may be used 
to obtain infinite classes of second order rotatable designs in dimensions higher 
than three by a suitable generation and combination of basic point sets. Also 
presented here is a method for adding to a second order rotatable design in 
(k — 1) dimensions in order to convert it to a second order design in k dimen- 
sions—a method useful in situations where it is desired to add an extra variable 
while making use of data already obtained. 


1. The generation of point sets in four or more dimensions. Let $(x_1, x_2, \cdots, x_k)$
be a point in $k$ dimensions and let $P_k$ be the symmetric group of order $k$, that
is, the group of all permutations of $k$ elements. Thus we obtain $k!$ points by
operating upon $(x_1, x_2, \cdots, x_k)$ with the elements of $P_k$. Let $R_i$ be the
transformation on $k$-space which takes the point $(x_1, x_2, \cdots, x_i, \cdots, x_k)$ into the
point $(x_1, x_2, \cdots, -x_i, \cdots, x_k)$. From a single point $(x_1, x_2, \cdots, x_k)$, by
an application of the $k!$ elements of $P_k$ and/or the $k$ transformations $R_i$
$(i = 1, 2, \cdots, k)$, we can obtain a set of $2^kk!$ points all of which are distinct,
provided that $x_1, x_2, \cdots, x_k$ are all non-zero and distinct. The set, which we
shall call $H(x_1, x_2, \cdots, x_k)$ and which consists of the points

$(\pm x_{i_1}, \pm x_{i_2}, \cdots, \pm x_{i_k})$

where $i_1, i_2, \cdots, i_k$ run through every possible permutation of $1, 2, \cdots, k$,
satisfies the following conditions:

$\sum_u x_{iu}^2 = (k - 1)!\,2^k(x_1^2 + x_2^2 + \cdots + x_k^2)$,

$\sum_u x_{iu}^4 = (k - 1)!\,2^k(x_1^4 + x_2^4 + \cdots + x_k^4)$,

$\sum_u x_{iu}^2x_{ju}^2 = (k - 2)!\,2^k\sum_{i,j=1;\, i \ne j}^k x_i^2x_j^2, \quad (i \ne j)$,


and all odd sums of squares and products up to and including order four are 
zero, where i,j = 1, 2, --- , k; and u is summed from 1 through N, the total 







number of points. Hence

$Ex[H(x_1, x_2, \cdots, x_k)] = (k - 2)!\,2^k[(k - 1)\sum_{i=1}^k x_i^4 - 3\sum_{i \ne j} x_i^2x_j^2]$,

where $Ex[H]$ is the excess of the point set $H$ and was defined in [1]. For $H$ as
defined above, it is the amount by which the pure fourth moment $\sum_u x_{iu}^4$ exceeds
three times the mixed fourth moment $\sum_u x_{iu}^2x_{ju}^2$. The number of points in this
set is too large for use in a design and it will be necessary to reduce the size
of the set by making several of the $x_i$ equal to one another and/or putting some
of the $x_i$ equal to zero.
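The moment conditions and the excess of $H(x_1, \cdots, x_k)$ can be confirmed by direct enumeration. The sketch below is illustrative only: the point $(1, 2, 3, 4)$ is a hypothetical choice with $k = 4$, and `generate_h` and `moment` are names introduced here rather than notation from the paper.

```python
from itertools import permutations, product
from math import factorial

def generate_h(point):
    """All points (+-x_{i1}, ..., +-x_{ik}) over every permutation and sign."""
    pts = set()
    for perm in permutations(point):
        for signs in product((1, -1), repeat=len(point)):
            pts.add(tuple(s * v for s, v in zip(signs, perm)))
    return sorted(pts)

def moment(points, powers):
    """Sum over the design points of prod_i x_i ** powers[i]."""
    total = 0
    for pt in points:
        term = 1
        for v, e in zip(pt, powers):
            term *= v ** e
        total += term
    return total

x = (1, 2, 3, 4)  # hypothetical distinct non-zero coordinates, k = 4
k = len(x)
H = generate_h(x)

pure2 = moment(H, (2, 0, 0, 0))
pure4 = moment(H, (4, 0, 0, 0))
mixed = moment(H, (2, 2, 0, 0))
excess = pure4 - 3 * mixed

sum2 = sum(v ** 2 for v in x)
sum4 = sum(v ** 4 for v in x)
cross = sum2 ** 2 - sum4  # sum over ordered pairs i != j of x_i^2 x_j^2

print(len(H) == 2 ** k * factorial(k))
print(pure2 == factorial(k - 1) * 2 ** k * sum2)
print(pure4 == factorial(k - 1) * 2 ** k * sum4)
print(mixed == factorial(k - 2) * 2 ** k * cross)
print(excess == factorial(k - 2) * 2 ** k * ((k - 1) * sum4 - 3 * cross))
```

Even for $k = 4$ the full set already has 384 points, which illustrates why the text goes on to reduce it.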

We note that we could have begun this discussion by considering a set of
only $2^{k-1}k!$ points. The group of all permutations has, as a subgroup, the group
of all the even permutations. The set obtained from a point $(x_1, x_2, \cdots, x_k)$,
when $x_1, x_2, \cdots, x_k$ are all distinct, by application of the even permutations
only is such that its moments are symmetrical in the way we desire. However,
nothing will eventually be gained by this procedure, for, once we make two of
the $x_i$'s equal in the more general set, we shall obtain double the set we would
have obtained from the set generated by use of even permutations alone. Thus,
except in the most general case when $x_1, x_2, \cdots, x_k$ are all distinct, no
additional reduction will be achieved by our commencing with a half set. Note
that when $k > 3$, a cyclic permutation of coordinates does not achieve symmetry.

When there are $k$ factors, the number of constants to be estimated for a
second order model is $1 + k + k + k(k - 1)/2$ or $(k^2 + 3k + 2)/2$. For $4 \le k
\le 7$, we have the following table:

    $k$                       4     5     6     7
    $(k^2 + 3k + 2)/2$       15    21    28    36

To obtain a design consisting of a number of points equal to twice the number 
of constants to be estimated will be regarded here as a very desirable achievement. 
Unfortunately, because of the large number of moments to be balanced when 
selecting design points, such an achievement is rarely possible with the method 
of this paper. Thus some of the designs to be presented are useful only when a 
fairly large number of design points is allowable. In order to restrict the number
of points in a generated set, we shall consider only cases where no more than
three of $x_1, x_2, \cdots, x_k$ are distinct.

Consider the fraction of $H(p, \cdots, p;\ q, \cdots, q;\ r, \cdots, r)$ which contains all
possible points once and once only. Let $p$ occur $x$ times, $q$ occur $y$ times, and $r$
occur $z$ times, so that $x + y + z = k$. Let $w$ be the number of zeros if any of
$p$, $q$, and $r$ are zero. For example, if $p \ne 0$, $q \ne 0$, $r = 0$, then $w = z$. Hence
the desired fraction of the whole set, which may be denoted by $H(p^x, q^y, r^z)$,
contains

$\frac{k!}{x!\,y!\,z!}\,2^{k-w}$

points. Therefore, the set may be written as $[x!\,y!\,z!\,2^w]^{-1}H(p^x, q^y, r^z)$, in
notation consistent with earlier usage (see [1]).
This set has sums of powers and products as follows: 


$\sum_u x_{iu}^2 = \frac{(k - 1)!}{x!\,y!\,z!}\,2^{k-w}[xp^2 + yq^2 + zr^2]$,

$\sum_u x_{iu}^4 = \frac{(k - 1)!}{x!\,y!\,z!}\,2^{k-w}[xp^4 + yq^4 + zr^4]$,

$\sum_u x_{iu}^2x_{ju}^2 = \frac{(k - 2)!}{x!\,y!\,z!}\,2^{k-w}[x(x - 1)p^4 + y(y - 1)q^4 + z(z - 1)r^4 + 2xyp^2q^2 + 2yzq^2r^2 + 2zxr^2p^2]$,

and all other sums of powers and products up to and including order four are
zero. Hence

$Ex\{[x!\,y!\,z!\,2^w]^{-1}H(p^x, q^y, r^z)\} = \frac{(k - 2)!}{x!\,y!\,z!}\,2^{k-w}[x(k - 3x + 2)p^4 + y(k - 3y + 2)q^4 + z(k - 3z + 2)r^4 - 6xyp^2q^2 - 6yzq^2r^2 - 6zxr^2p^2]$

is the excess of this generated set of $(k!/x!\,y!\,z!)\,2^{k-w}$ points.

By giving specific values to $p$, $q$, $r$, $x$, $y$ and $z$, we shall obtain the more useful
sets of this type. In particular, we shall reject any set that contains more than
48 points in four dimensions. If $p$, $q$ and $r$ are distinct and all are non-zero, there
are $4!\,2^4/2 = 192$ points in four dimensions. If $p \ne 0$, $q \ne 0$, $r = 0$, and $z = 1$,
the number of points is again too large. Hence, if we require
three distinct values for $p$, $q$ and $r$, we must put $r = 0$ and allow $p$ and $q$ to
occur once only in order to maintain a reasonable number of points. This leads
us to consider the point set

$S(p, q, 0^{k-2}) = [4(k - 2)!]^{-1}H(p, q, 0, \cdots, 0)$

obtained by setting $r = 0$, $x = y = 1$. The set has $4k(k - 1)$ points and its
excess is $4(k - 1)(p^4 + q^4) - 24p^2q^2$. A short table of the number of points in
this set follows:

    $k$              4     5     6     7
    $4k(k - 1)$     48    80   120   168

$S(p, q, 0^{k-2})$ by itself forms a rotatable arrangement if

$4(k - 1)(p^4 + q^4) - 24p^2q^2 = 0$,

that is, if

$p^2/q^2 = [3 \pm \sqrt{9 - (k - 1)^2}]/(k - 1)$.





Since $k \ge 4$, this is possible only when $k = 4$ and $p^2/q^2 = 1$. But if $p^2 = q^2$,
the set can be reduced by half so that

$\tfrac{1}{2}S(p, p, 0^{k-2}) = [8(k - 2)!]^{-1}H(p, p, 0, \cdots, 0)$,

consisting of $2k(k - 1)$ points, forms a rotatable arrangement. The single design
which arises from this calculation is already known and is called the
extension of (25) by Gardiner, Grandage and Hader [3]. If we consider only
$S(p, p, 0^{k-2})$ to begin with, this result is trivial, since the excess of $S(p, p, 0^{k-2})$
is $4(k - 4)p^4$, which can be zero only if $k = 4$, since $p \ne 0$.

Although it would be possible to use the $4k(k - 1)$ points of $S(p, q, 0^{k-2})$ in
combination with other sets to form a rotatable design, we shall not do this
because of the large number of points which would be involved. This leads us to
mention one other point set that will not be used, given by $z = 0$,
$x = 1$, $y = k - 1$, namely, $[(k - 1)!]^{-1}H(p, q^{k-1})$. This contains $k2^k$ points,
too many for our purposes as the short table which follows shows.

    $k$         4     5     6     7
    $k2^k$     64   160   384   896

By the usual methods, it may be shown that when

$p^2 = (3 + \sqrt{2k + 4})\,q^2$

and

$q^2 = N/2^k(2 + k + \sqrt{2k + 4})$,

a rotatable design is obtained, a design already quoted by Gardiner, Grandage
and Hader [3] as an extension of their design (23).


Thus it becomes clear that the only point sets which are a fraction of
$H(p^x, q^y, r^z)$ and which obey all the required moment conditions except that
their excess is not zero and which, in addition, contain what we shall consider
a reasonable number of points are obtained by setting $z = 0$, $q = 0$ (i.e.,
letting the coordinates take two distinct values, one of which is zero) or setting
$z = y = 0$ (i.e., allowing only one possible value for the coordinate). Proceeding
in this way, we consider the five sets listed in Table I as suitable for combination
with one another for the formation of rotatable designs. We shall not use
the fractional set notation here because of its unwieldiness. Other possible sets
are neglected on the grounds that they contain too many points for our purposes.


TABLE I
Selected point sets

    Set                   Points of set                                         No. of points          Value of N (k = 4, 5, 6, 7)    Excess
    $S_1$                 $(\pm a, \pm a, \cdots, \pm a)$                        $2^k$                  16    32    64   128          $-2^{k+1}a^4$
    $\tfrac12 S_1$        one half replicate of $S_1$ ($k \ge 5$ only)           $2^{k-1}$              -     16    32    64          $-2^ka^4$
    $S_2$                 $(\pm c, 0, \cdots, 0)$ and permutations               $2k$                    8    10    12    14          $2c^4$
    $S_3$                 $(0, \pm f, \cdots, \pm f)$ and permutations           $k2^{k-1}$             32    80   192   448          $-(2k - 5)2^{k-1}f^4$
    $S_4$                 $(\pm p, \pm p, 0, \cdots, 0)$ and permutations        $2k(k - 1)$            24    40    60    84          $4(k - 4)p^4$
    $S_5$                 $(\pm p, \pm p, \pm p, 0, \cdots, 0)$ and permutations $4k(k - 1)(k - 2)/3$   32    80   160   280          $4(k - 2)(k - 7)p^4$


Several features of the sets above are immediately noticeable:
$S_1$ and $S_3$ have negative excess.
$S_5$ has negative excess if $k < 7$, positive excess if $k > 7$ and zero excess if
$k = 7$. (Thus if $k = 7$, the points of $S_5$ form a rotatable arrangement; the
design thus formable will be derived later.)
$S_2$ has positive excess.
$S_4$ has positive excess if $k > 4$.
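The sizes and excess values of the five sets can be reproduced by enumeration. The following sketch is an illustration under stated assumptions, not the author's procedure: it takes $a = c = f = p = 1$, labels the sets `S1` to `S5` in the order used in the list of features above, and computes the excess as the pure fourth moment minus three times the mixed fourth moment. The helper names are hypothetical.

```python
from itertools import permutations, product

def signed_permutations(base):
    """Distinct points from permuting `base` and attaching +- to non-zero entries."""
    pts = set()
    for perm in set(permutations(base)):
        nonzero = [i for i, v in enumerate(perm) if v != 0]
        for signs in product((1, -1), repeat=len(nonzero)):
            pt = list(perm)
            for i, s in zip(nonzero, signs):
                pt[i] *= s
            pts.add(tuple(pt))
    return pts

def excess(points):
    """Sum of x_1^4 minus three times the sum of x_1^2 x_2^2.

    The sets used here are symmetric in the coordinates, so the first two
    coordinates stand for any i and any pair i != j.
    """
    pure = sum(p[0] ** 4 for p in points)
    mixed = sum(p[0] ** 2 * p[1] ** 2 for p in points)
    return pure - 3 * mixed

def table_one(k, a=1, c=1, f=1, p=1):
    """The five candidate sets for dimension k (unit parameter values assumed)."""
    return {
        "S1": signed_permutations([a] * k),
        "S2": signed_permutations([c] + [0] * (k - 1)),
        "S3": signed_permutations([0] + [f] * (k - 1)),
        "S4": signed_permutations([p, p] + [0] * (k - 2)),
        "S5": signed_permutations([p, p, p] + [0] * (k - 3)),
    }

for k in range(4, 8):
    sets = table_one(k)
    print(k, {name: (len(s), excess(s)) for name, s in sets.items()})
```

The printed pairs are (number of points, excess); their signs are what decide which sets can be combined so that the total excess vanishes.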
These facts determine the combinations of sets we shall choose to form several 


infinite classes of rotatable designs analogous to those formed in three dimen- 
sions. 


2. Infinite classes of second order designs in four or more dimensions. The 
generated sets may now be combined in the same way as was done previously in 
the three dimensional case [1]. Six of the more useful combinations are presented 
in Table II. All of the previously known designs (apart from the two mentioned 
separately in Section 1) occur as special cases of the classes in the table. 


3. A method of constructing a second order design in k dimensions using a
second order design in (k - 1) dimensions. Select a second order rotatable
arrangement of points in $(k - 1)$ dimensions to which the scaling condition
$\lambda_2 = 1$ has not yet been applied. Then we shall have (say) $N'$ points

$(x_{1u}, x_{2u}, \cdots, x_{k-1,u}), \quad 1 \le u \le N'$,

for which

$\sum_u x_{iu}^2 = A$,

$\sum_u x_{iu}^4 = 3\sum_u x_{iu}^2x_{ju}^2 = 3C$, say, $(i \ne j)$, $i, j = 1, 2, \cdots, (k - 1)$,

and all odd sums of powers and products up to and including order four are
zero. Consider all the points obtained by adding a further coordinate $x_k = \pm b$
to the coordinates of the $(k - 1)$ dimensional points. Thus we obtain a point
set in $k$ dimensions

(3.1) $(x_{1u}, x_{2u}, \cdots, x_{k-1,u}, \pm b)$,

consisting of $2N'$ points.





TABLE II
Classes of rotatable designs
[The body of Table II is illegible in the source scan.]
Consider the point sets

(3.2) $(0, 0, \cdots, 0, \pm p)$

(3.3) $(0, 0, \cdots, 0, \pm q)$.

Then the values of $p$ and $q$ may be so adjusted that the three sets (3.1), (3.2)
and (3.3), together with any center points which may be added, form a second
order rotatable design in $k$ dimensions. The number of points in the derived
design is $N = 2N' + 4 + n_0$.

This may be shown as follows. The addition of the extra coordinate $x_k$ to the
$(k - 1)$ dimensional point set contributes only to $\sum x_{ku}^2$, $\sum x_{ku}^4$ and $\sum x_{iu}^2x_{ku}^2$;
$i = 1, 2, \cdots, (k - 1)$. It is clear that moments which were previously zero
remain zero and that odd sums of powers and products involving $x_k$ are zero
because $x_k$ is constant $(\pm b)$. Thus these sums of powers and products will be
zero either for each set of $N'$ points separately or else for the two sets combined.
Thus for all of the $N = 2N' + 4 + n_0$ points,

(3.4)
$\sum_u x_{iu}^2 = 2A$,
$\sum_u x_{iu}^4 = 6C$,
$\sum_u x_{iu}^2x_{ju}^2 = 2C$,
$\sum_u x_{ku}^2 = 2N'b^2 + 2(p^2 + q^2)$,
$\sum_u x_{ku}^4 = 2N'b^4 + 2(p^4 + q^4)$,
$\sum_u x_{iu}^2x_{ku}^2 = 2b^2A$,

where $u$ is summed from 1 through $N$, and all other sums of powers and products
up to and including order four are zero.

There will be symmetry in the moments up to fourth order provided that

(3.5)
$p^2 + q^2 + N'b^2 = A$,
$p^4 + q^4 + N'b^4 = 3C$,
$Ab^2 = C$.

Thus, if these conditions can be satisfied by choice of $p$, $q$ and $b$, the $N$ points
will automatically form a second order rotatable arrangement, since the
equations above imply that $\sum_u x_{iu}^4 = 3\sum_u x_{iu}^2x_{ju}^2 = 6C$ for $i \ne j$ and $i, j =
1, 2, \cdots, k$. From (3.5)

(3.6)
$b^2 = C/A$,
$p^2 + q^2 = (A^2 - N'C)/A$,
$p^4 + q^4 = C(3A^2 - N'C)/A^2$.
Solving the simultaneous equations of (3.6) we obtain 
(3.7) $p^2, q^2 = [(A^2 - N'C) \pm \sqrt{2C(3A^2 - N'C) - (A^2 - N'C)^2}]/2A$.

We now apply the scaling condition $\lambda_2 = 1$, which gives $2A = N$ or $A = N/2$,
and this must be substituted into the expressions for $p^2$ and $q^2$ above. Hence,

(3.8) $p^2, q^2 = [(N^2 - 4N'C) \pm \sqrt{8C(3N^2 - 4N'C) - (N^2 - 4N'C)^2}]/8A$.

In order that both $p^2$ and $q^2$ should be real and non-negative, i.e., in order that
a new design should be obtainable, the original design must satisfy the condition

(3.9) $2 \ge \phi \ge 1$,

where $\phi = (A^2 - N'C)^2/C(3A^2 - N'C)$. It is necessary to determine in the
usual way for individual cases whether or not the addition of center points is
required.

As an illustration of the method we now derive a second order design in four
dimensions from a second order design in three dimensions. Consider the well
known cube plus octahedron arrangement in three dimensions with no center
points, given by

$(\pm a, \pm a, \pm a)$,
$(\pm c, 0, 0)$,
$(0, \pm c, 0)$,
$(0, 0, \pm c)$.

In the notation of this section,

(3.10) $3C = 8a^4 + 2c^4 = 24a^4 = 3(8a^4)$.

Hence

$c^4 = 8a^4 = (2^{3/4}a)^4 = (1.682a)^4$,

so that

(3.11) $C = 8a^4$, $A = 8a^2 + 2c^2 = 4(2 + \sqrt{2})a^2$, and $N' = 14$.

Thus $\phi = 1.55$, and use of the method is possible.
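The arithmetic behind $\phi = 1.55$ follows directly from (3.6), (3.7) and (3.9). The sketch below, with the hypothetical helper `extend_parameters` and the arbitrary choice $a = 1$, solves for $b^2$, $p^2$ and $q^2$ from $A$, $C$ and $N'$ and rejects inputs for which condition (3.9) fails.

```python
from math import sqrt

def extend_parameters(A, C, n_prime):
    """Solve (3.5) for b^2, p^2, q^2; also return phi from (3.9)."""
    b2 = C / A
    s = (A * A - n_prime * C) / A                 # p^2 + q^2, from (3.6)
    f = C * (3 * A * A - n_prime * C) / (A * A)   # p^4 + q^4, from (3.6)
    phi = s * s / f
    if not 1 <= phi <= 2:
        raise ValueError("condition (3.9) fails: no real non-negative p^2, q^2")
    root = sqrt(2 * f - s * s)
    return b2, (s + root) / 2, (s - root) / 2, phi

# Cube plus octahedron in three dimensions, with a = 1 (illustrative scale).
a = 1.0
c = 2 ** 0.75 * a
A = 8 * a ** 2 + 2 * c ** 2
C = 8 * a ** 4
b2, p2, q2, phi = extend_parameters(A, C, 14)
print(round(phi, 2), round(sqrt(b2), 3), round(p2, 4), round(q2, 4))
```

Here $\phi$ equals $(p^2 + q^2)^2/(p^4 + q^4)$, so the bound $\phi \le 2$ keeps the square root real and $\phi \ge 1$ keeps the smaller root non-negative.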





Consider the point set in four dimensions given by

$(\pm a, \pm a, \pm a, \pm b)$,
$(\pm c, 0, 0, \pm b)$,
$(0, \pm c, 0, \pm b)$,
$(0, 0, \pm c, \pm b)$,
$(0, 0, 0, \pm p)$,
$(0, 0, 0, \pm q)$.

These points form a second order rotatable arrangement if the solutions for $p^2$
and $q^2$ which result from substitution of (3.11) into (3.7) are real and
non-negative, which they are, since $2 \ge \phi \ge 1$. Performing the calculation, we find
that $p^2 = 4.196400\,a^2$, $q^2 = 1.259446\,a^2$, so that $p = 2.049\,a$, $q = 1.122\,a$. We
recall that $c = 1.682\,a$, while $b = \sqrt{C/A} = 0.765\,a$. Thus we have a second
order rotatable arrangement in four dimensions with 32 points given by

$(\pm a, \pm a, \pm a, \pm 0.765\,a)$,
$(\pm 1.682\,a, 0, 0, \pm 0.765\,a)$,
$(0, \pm 1.682\,a, 0, \pm 0.765\,a)$,
$(0, 0, \pm 1.682\,a, \pm 0.765\,a)$,
$(0, 0, 0, \pm 2.049\,a)$,
$(0, 0, 0, \pm 1.122\,a)$,
where a is to be determined by application of the scaling condition \, = 1. The 
separate sets which comprise the arrangement have radii +~/3a? + b?, ~/c? + Bb? 
p and q, that is av/3.585, av/3.414, p and gq, or 1.189 a, 1.848 a, 2.049 a and 


1.122 a. Thus the arrangement is not spherical, and it can be used as a design 
without addition of center points. However, 


hs/AE = 16a°/N = .02144N, 


where $N = 32 + n_0$. Hence $\lambda_4/\lambda_2^2 = .686$ when $n_0 = 0$. This is greater than
the singular value of .667 (for $k = 4$), but not very much so; it would therefore
be preferable to use a few center points with this design. When, for example,
$n_0 = 4$, $\lambda_4/\lambda_2^2 = .772$. After deciding on the number of center points to be used,
we can determine the value of $a$ which specifies the design points from the
scaling condition. This gives

$a^2 = (2 - \sqrt{2})N/16 = .03661N$.
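
The figures of this example can be checked numerically. The following Python sketch is not part of the original paper; the helper names `sign_variants`, `four_dim_arrangement`, `moment` and `lambda_ratio` are illustrative assumptions. It builds the 32 points with $a = 1$, using the values of $p^2$ and $q^2$ quoted in the text rather than re-solving (3.7), so that the moment conditions and the ratio $\lambda_4/\lambda_2^2$ can be inspected.

```python
import itertools
import math

def sign_variants(point):
    """All distinct points obtained by attaching +/- signs to nonzero coordinates."""
    options = [(x, -x) if x != 0 else (0.0,) for x in point]
    return set(itertools.product(*options))

def four_dim_arrangement(a=1.0):
    """The 32-point arrangement of the text, using the quoted p^2 and q^2."""
    c = 2 ** 0.75 * a              # from c^4 = 8 a^4
    C = 8 * a ** 4                 # sum of x_i^2 x_j^2 over the 3-dimensional set
    A = 8 * a ** 2 + 2 * c ** 2    # sum of x_i^2 over the 3-dimensional set
    b = math.sqrt(C / A)
    p = math.sqrt(4.196400) * a
    q = math.sqrt(1.259446) * a
    bases = [(a, a, a, b), (c, 0, 0, b), (0, c, 0, b), (0, 0, c, b),
             (0, 0, 0, p), (0, 0, 0, q)]
    points = set()
    for base in bases:
        points |= sign_variants(base)
    return sorted(points)

def moment(points, powers):
    """Sum over the design points of prod_i x_i ** powers[i]."""
    return sum(math.prod(x ** e for x, e in zip(pt, powers)) for pt in points)

def lambda_ratio(points, n0=0):
    """lambda_4 / lambda_2^2 when n0 center points are added to the design."""
    N = len(points) + n0
    return N * moment(points, (2, 2, 0, 0)) / moment(points, (2, 0, 0, 0)) ** 2

pts = four_dim_arrangement()
print(len(pts))
print(round(lambda_ratio(pts, 0), 3), round(lambda_ratio(pts, 4), 3))
```

Because $p^2$ and $q^2$ are quoted to six decimals only, the rotatability conditions (equal $\sum x_i^2$ for every coordinate, and $\sum x_i^4 = 3\sum x_i^2x_j^2$) hold in this sketch up to a small rounding error, not exactly.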


4. Acknowledgement. I wish to acknowledge with gratitude the guidance and 
encouragement of Dr R.C. Bose in the preparation of this paper. 







REFERENCES 
[1] R. C. Bose and Norman R. Draper, "Second order rotatable designs in three
dimensions", Ann. Math. Stat., Vol. 30 (1959), pp. 1097-1112.

[2] G. E. P. Box and J. S. Hunter, "Multi-factor experimental designs for exploring
response surfaces", Ann. Math. Stat., Vol. 28 (1957), pp. 195-241.

[3] D. A. Gardiner, A. H. E. Grandage and R. J. Hader, "Some third order rotatable
designs", Institute of Statistics Mimeo Series No. 149, Raleigh, North Carolina,
1956.





A MATRIX SUBSTITUTION METHOD OF CONSTRUCTING
PARTIALLY BALANCED DESIGNS¹


By B. V. Shah²
University of Bombay


1. Introduction and summary. Vartak [6] has considered the construction of 
experimental designs with the help of Kronecker products of matrices. The 
method is equivalent to the replacement of two elements, 0 and 1, by two 
matrices. A generalisation of the above idea is given by the author [4], using 
only the incidence matrices of balanced incomplete block (BIB) designs for 
substitution. In the present paper the same idea is extended to the case where 
substitution is by the incidence matrices of partially balanced incomplete block
(PBIB) designs and factorial experiments. In Sections 2 and 3 some ideas re- 
garding canonical vectors and PBIB designs are introduced. Section 4 deals 
with associable designs and their properties. In Section 5 balanced matrices are 
defined and in Section 6 a method is given for constructing designs by substitut- 
ing for the elements of a balanced matrix, the incidence matrices of associable 
designs. The application of this method to the construction of factorial experi- 
ments is considered in Section 7. 


2. A canonical matrix. Let $N = [n_{ij}]$ be the incidence matrix of a design, where
$n_{ij}$ is the number of times the $i$th treatment occurs in the $j$th block. Let the $i$th
treatment be replicated $r_i$ times and the $j$th block have $k_j$ plots. The $C$-matrix
of the design is defined by


(2.1) $C = \operatorname{diag}(r_1, r_2, \ldots, r_v) - N \operatorname{diag}(k_1^{-1}, k_2^{-1}, \ldots, k_b^{-1}) N'$,


where $\operatorname{diag}(a_1, a_2, \ldots, a_n)$ stands for a diagonal matrix with diagonal elements
equal to $a_1, a_2, \ldots, a_n$ respectively.

If $l$ is a vector such that $l'l = 1$ and $Cl = al$, then the vector $l$ is called a
canonical vector of the design. If $l_1, l_2, \ldots, l_v$ form a set of $v$ mutually
orthogonal canonical vectors, then the $v \times v$ matrix $L$ whose $i$th column is $l_i$ will be
called a canonical matrix of the design.

The importance of a canonical matrix is quite obvious, since knowledge of it
enables one to analyse even the most complicated design. For a design in which
$r_1 = r_2 = \cdots = r_v$ and $k_1 = k_2 = \cdots = k_b$, the same canonical matrix $L$
reduces both $L'CL$ and $L'NN'L$ to the diagonal form. Hence the properties of a
canonical matrix of such a design can be studied with reference to the matrix
$NN'$.

Received December 10, 1958; revised April 27, 1959. 


¹ This work was supported by a Research Training Scholarship of the Government of
India.


² Present address, Iowa State University.







In this paper we shall consider only those designs for which $n_{ij} = 1$ or $0$,
$r_1 = r_2 = \cdots = r_v = r$ and $k_1 = k_2 = \cdots = k_b = k$. These conditions are satisfied
in most of the designs used in practice.

The following matrix theorem (Thrall and Tornheim [5], p. 189) will be
useful in later sections.

THEOREM 2.1. Let $A_1, A_2, \ldots, A_n$ be a set of real symmetric matrices such
that every pair commute. Then there exists an orthogonal matrix $L$ such that
$L'A_iL = D_i$, where each $D_i$ is diagonal.


3. Canonical matrix of PBIB designs. A PBIB design with $m$ associate classes
has been defined by Bose and Shimamoto [1] substantially as follows: A PBIB
design with $m$ associate classes ($m \ge 1$) is an arrangement of $v$ treatments in $b$
blocks of $k$ plots each such that

(i) Each of the v treatments is replicated exactly r times and no treatment 
appears more than once in a block. 

(ii) There exists a relationship of association between every pair of the treat- 
ments satisfying the following conditions: 

(a) Any two treatments are either first, second, ..., or $m$th associates.

(b) Each treatment has exactly $n_i$ $i$th associates ($i = 1, 2, \ldots, m$).

(c) Given any two treatments which are $i$th associates, the number of
treatments that are both $j$th associates of the first and $k$th associates of the
second is $p^i_{jk}$ and is independent of the pair of treatments with which we start.
Also $p^i_{jk} = p^i_{kj}$.

(iii) Two treatments which are $i$th associates occur together in exactly $\lambda_i$
blocks.

Further we shall define each treatment to be its own 0th associate and 0th
associate of no other treatment. Consistently we define


(3.1) $\lambda_0 = r$, $n_0 = 1$, $p^0_{ij} = n_i\delta_{ij}$, $p^i_{0j} = p^i_{j0} = \delta_{ij}$,


where $\delta_{ij}$ is the Kronecker delta, which is defined for all pairs of natural numbers
$i, j$ as $\delta_{ij} = 1$ if $i = j$, and $\delta_{ij} = 0$ if $i \ne j$.

Each of the associate classes of a PBIB design can define the corresponding
association matrix $B_t = [\beta^t_{ij}]$ ($t = 1, 2, \ldots, m$), where $\beta^t_{ij} = 1$ if the $i$th and
$j$th treatments are $t$th associates and $\beta^t_{ij} = 0$ otherwise. Now, from the
definition, it can be shown that for a PBIB design with incidence matrix $N$,


(3.2) $NN' = \sum_{t=0}^{m} \lambda_t B_t$.


It should be noted that the results of this and the next two sections lean almost
entirely on the fact that $NN' = \sum_t \lambda_t B_t$ or on its canonical equivalents, such as
$N_1N_2' = \sum_t \mu_t B_t$ of Theorem 4.1.
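
As an illustration of (3.2), the following Python sketch checks the identity on a very small design. The design itself is a hypothetical example chosen here, not one taken from the paper: four treatments in two groups {0, 1} and {2, 3}, with blocks formed by the pairs of treatments from different groups, so that $r = 2$, $\lambda_1 = 0$ (same group) and $\lambda_2 = 1$ (different groups). The helper names are likewise assumptions.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

# Hypothetical illustration: two groups of two treatments each.
groups = [{0, 1}, {2, 3}]
blocks = [(0, 2), (0, 3), (1, 2), (1, 3)]
v = 4

# Incidence matrix: rows are treatments, columns are blocks.
N = [[1 if i in blk else 0 for blk in blocks] for i in range(v)]

def same_group(i, j):
    return any(i in g and j in g for g in groups)

# Association matrices: B0 is the identity (each treatment is its own 0th associate).
B0 = [[1 if i == j else 0 for j in range(v)] for i in range(v)]
B1 = [[1 if i != j and same_group(i, j) else 0 for j in range(v)] for i in range(v)]
B2 = [[1 if not same_group(i, j) else 0 for j in range(v)] for i in range(v)]

lam = [2, 0, 1]   # lambda_0 = r, lambda_1, lambda_2 for this design
rhs = [[lam[0] * B0[i][j] + lam[1] * B1[i][j] + lam[2] * B2[i][j]
        for j in range(v)] for i in range(v)]

print(matmul(N, transpose(N)) == rhs)       # NN' equals the sum of lambda_t B_t
print(matmul(B1, B2) == matmul(B2, B1))     # the association matrices commute
```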

For the sake of brevity, we shall often denote the design by its incidence matrix,
say $N_c$, and its parameters by $v(c)$, $b(c)$, $r(c)$, $k(c)$, $m(c)$, $\lambda_i(c)$, $n_i(c)$, $p^i_{jk}(c)$,
and the association matrices of the design by $B_t(c)$, $t = 0, 1, \ldots, m(c)$. In two
PBIB designs $N_1$ and $N_2$, if $v(1) = v(2)$, $m(1) = m(2)$ and $B_t(1) = B_t(2)$
$\{t = 0, 1, \ldots, m(1)\}$, then the $i$th and $j$th treatments of $N_1$ are $p$th associates
if and only if the $i$th and $j$th treatments of $N_2$ are $p$th associates. Hence
$n_i(1) = n_i(2)$ and $p^i_{jk}(1) = p^i_{jk}(2)$. Consequently, it follows that the equality of
association matrices implies the equality of the secondary parameters $n_i$ and $p^i_{jk}$,
but the converse is not true in general. This point is important, since we shall
be concerned with the equality of association matrices in Theorems 3.1, 4.1
and Definition 5.2.

The following four designs with given parameters and a suitable association
scheme can be considered as PBIB designs. (The parameters $n_i$ and $p^i_{jk}$
naturally depend upon the association scheme.)

(a) A null design with the incidence matrix $O(v, b)$ (a $v \times b$ matrix with all
the elements equal to zero). Parameters: $v$, $b$, $r = k = \lambda_1 = \lambda_2 = \cdots = \lambda_m = 0$.

(b) A randomized block design with the incidence matrix $E(v, b)$ (a $v \times b$
matrix with all the elements equal to unity). Parameters: $v$, $b$, $r = b$, $k = v$,
$\lambda_1 = \lambda_2 = \cdots = \lambda_m = b$.

(c) A BIB design of $v^*$ treatments, each replicated $r^*$ times in $b^*$ blocks of $k^*$
plots each, such that each pair of treatments occurs together in exactly $\lambda^*$ blocks.
Parameters: $v = v^*$, $b = b^*$, $k = k^*$, $\lambda_1 = \lambda_2 = \cdots = \lambda_m = \lambda^*$.

(d) An identity design with the incidence matrix $I(v)$ (a $v \times v$ identity
matrix). Parameters: $v = b$, $r = k = 1$, $\lambda_1 = \lambda_2 = \cdots = \lambda_m = 0$.

THEOREM 3.1. If there are $s$ PBIB designs $N_1, N_2, \ldots, N_s$ such that
$v(1) = v(2) = \cdots = v(s) = v$, $m(1) = \cdots = m(s) = m$ and $B_t(1) =
B_t(2) = \cdots = B_t(s) = B_t$ for $t = 1, 2, \ldots, m$, then there exists an orthogonal
matrix $L$ which is a canonical matrix for each of the $s$ designs.

PROOF: From the definition of a PBIB design it can be shown that


(3.3) $B_jB_k = \sum_{i=0}^{m} p^i_{jk} B_i$.


Since $p^i_{jk} = p^i_{kj}$, it follows that

(3.4) $B_jB_k = B_kB_j$.


Hence by Theorem 2.1, there exists an orthogonal matrix $L$ such that $L'B_iL$
is diagonal for $i = 0, 1, \ldots, m$. Since


(3.5) $N_cN_c' = \sum_{t=0}^{m} \lambda_t(c) B_t$,


it follows that $L'N_cN_c'L$ is diagonal for all $c = 1, 2, \ldots, s$. Hence $L$ is a
canonical matrix for each of the $s$ designs.


4. Associable designs. 

DEFINITION 4.1. The $s$ designs $N_1, N_2, \ldots, N_s$, each in $v$ treatments and $b$
blocks, will be called associable designs, if there exists an orthogonal matrix $L$
such that $L'N_iN_j'L$ is diagonal for all $i, j = 1, 2, \ldots, s$. The matrix $L$ will be
called a canonical matrix of association.

LEMMA 4.1. Any design $N$ is associable with itself ($N$) and its complementary
design $[E(v, b) - N]$.

LEMMA 4.2. Any design is associable with a null design, or a randomised block
design, provided the numbers of treatments and blocks are the same for the different
designs.

LEMMA 4.3. The identity design is associable with any design whose incidence
matrix is a symmetric $v \times v$ matrix.

THEOREM 4.1. If two PBIB designs $N_1$ and $N_2$ are such that

(i) $v(1) = v(2) = v$; $b(1) = b(2) = b$; $m(1) = m(2) = m$; $B_t(1) = B_t(2) =
B_t$, $t = 1, 2, \ldots, m$;

(ii) and if the $b$ 'double blocks' formed by amalgamating the $j$th block of $N_1$ with
the $j$th block of $N_2$ are such that a treatment $i$ of $N_1$ and a treatment $j$ of $N_2$ occur
together in exactly $\mu_p(1, 2)$ double blocks if and only if the $i$th and $j$th treatments
are $p$th associates in either of the designs;

then $N_1$ and $N_2$ are associable.

PROOF. Let $N_1N_2' = (m_{ij})$. Then $m_{ij} = \sum_a n_{ia}(1)\,n_{ja}(2)$ = the number of
double blocks in which the treatment $i$ of $N_1$ and the treatment $j$ of $N_2$ occur
together $= \mu_p(1, 2)$, but $\mu_p(1, 2)$ also appears in the $i$th row and $j$th column of
$\sum_t \mu_t(1, 2) B_t$. Hence $N_1N_2' = \sum_t \mu_t(1, 2) B_t$, and similarly for $N_2N_1'$, whence
the result on applying an argument similar to that leading from (3.5) to the
conclusion of Theorem 3.1.

It should be noted that the conditions of Theorem 4.1 are not necessary, but
in some of the later results we shall assume that these sufficient conditions are
satisfied and then the parameters $\mu_p(1, 2)$ will be called the parameters of
association. Further when a PBIB is associable with itself, its complementary
design, a null design, or a randomised block design, the sufficient conditions
given in Theorem 4.1 are satisfied.


5. Balanced matrices. 

DEFINITION 5.1. If there exist $s$ $(u \times w)$ incidence matrices $N_1^*, N_2^*, \ldots, N_s^*$
such that $\sum_{i=1}^{s} N_i^* = E(u, w)$, and if there exists an orthogonal matrix $L^*$ such
that $L^{*\prime}(N_i^*N_j^{*\prime} + N_j^*N_i^{*\prime})L^*$ is diagonal for all $i, j = 1, 2, \ldots, s$, then the
matrix $A = \sum_{i=1}^{s} iN_i^*$ will be called a canonically balanced matrix in $s$ integers
$1, 2, \ldots, s$.

There is great resemblance between the conditions imposed on $N_i$ in Definition
4.1 and $N_i^*$ in Definition 5.1. The condition that all $L^{*\prime}(N_i^*N_j^{*\prime} + N_j^*N_i^{*\prime})L^*$
are diagonal implies that all the designs with incidence matrices $N_i^*$ and
$N_i^* + N_j^*$ have the same canonical matrix $L^*$. On the other hand, the condition
that all $L'(N_iN_j')L$ are diagonal is slightly stronger and implies that not only
all the designs with incidence matrices $N_i$ and $N_i + N_j$ have the same canonical
matrix $L$ but also each $N_iN_j'$ is symmetric, or $N_iN_j' = N_jN_i'$.

DEFINITION 5.2. If there exist $s$ PBIB designs with $(u \times w)$ incidence
matrices $N_1^*, N_2^*, \ldots, N_s^*$ such that $\sum_{i=1}^{s} N_i^* = E(u, w)$ and such that the
designs $N_i^*$ and $N_i^* + N_j^*$ ($i > j = 1, 2, \ldots, s$) all have the same association
matrices $B_t$, $t = 1, 2, \ldots, m$, then the matrix $A = \sum_{i=1}^{s} iN_i^*$ will be called a
partially balanced matrix.

LEMMA 5.1. A partially balanced matrix is also a canonically balanced matrix.

PROOF. The condition $\sum_i N_i^* = E(u, w)$ is satisfied in both Definitions 5.1
and 5.2. Now, if $N_i^*$ and $N_i^* + N_j^*$ are PBIB designs, then from (3.2), we have


(5.1) $N_i^*N_i^{*\prime} = \sum_{t=0}^{m} \lambda_t^*(i) B_t$


and

(5.2) $(N_i^* + N_j^*)(N_i^* + N_j^*)' = \sum_{t=0}^{m} \mu_t^*(ij) B_t$.

Hence


(5.3) $N_i^*N_j^{*\prime} + N_j^*N_i^{*\prime} = \sum_{t} \{\mu_t^*(ij) - \lambda_t^*(i) - \lambda_t^*(j)\} B_t$,

whence the result on applying an argument similar to that leading from (3.5)
to the conclusion of Theorem 3.1.

LEMMA 5.2. If $s_1$ integers are divided in $s_2$ groups, each group containing at least
one integer, and if all the integers of a group are replaced by an identical integer,
then a canonically balanced matrix in $s_1$ integers will be reduced to a canonically
balanced matrix in $s_2$ integers. A similar result holds also for a partially balanced
matrix.

DEFINITION 5.3. Let $A = [a_{ij}]$ ($i = 1, 2, \ldots, u$; $j = 1, 2, \ldots, w$) be a matrix
whose elements $a_{ij}$ take any one of the $s$ values $1, 2, \ldots, s$. The matrix $A$ will
be called a partially balanced matrix, if it satisfies the following conditions:

(a) The number of times the integer $c$ occurs in a row is the same for all the
rows and is equal to $\alpha(c)$, say.

(b) The number of times the integer $c$ occurs in any column is the same for
all the columns and is equal to $\beta(c)$, say.

(c) The rows have an association scheme similar to the treatments of a PBIB
design with parameters $n_i^*$ and $p^{i*}_{jk}$ ($i, j, k = 0, 1, \ldots, h$). The number of times
the combinations of integers $\begin{bmatrix} c \\ d \end{bmatrix}$ and $\begin{bmatrix} d \\ c \end{bmatrix}$ occur in any pair of rows which are
$i$th associates is the same for all the pairs of rows which are $i$th associates,
and is equal to $\nu_i(c, d)$. (The combinations $\begin{bmatrix} c \\ d \end{bmatrix}$ and $\begin{bmatrix} d \\ c \end{bmatrix}$ are considered
to be identical.)

The pair of integers $\begin{bmatrix} c \\ d \end{bmatrix}$ or $\begin{bmatrix} d \\ c \end{bmatrix}$ will be used in Theorem 6.2 to "mesh" so
to speak with the "double block" notion of Theorem 4.1, since the substitution
of integers $c$ and $d$ with matrices $N_c$ and $N_d$ will form double blocks of the form
$\begin{bmatrix} N_c \\ N_d \end{bmatrix}$ or $\begin{bmatrix} N_d \\ N_c \end{bmatrix}$.

The Definitions 5.2 and 5.3 are equivalent. This follows, since, starting with
Definition 5.3, if we replace the integer $c$ of the matrix $A$ by 1 and the remaining
$s - 1$ integers by 0, and if we call the resultant matrix $N_c^*$ ($c = 1, 2, \ldots, s$),
then from Definition 5.3, it can be shown that $N_i^*$ and $N_i^* + N_j^*$ are PBIB
designs with the same association scheme.


6. Construction of designs. The operator '$\times$' will denote the Kronecker
product of matrices defined by


(6.1) $A \times B = [a_{ij}] \times B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1w}B \\ a_{21}B & a_{22}B & \cdots & a_{2w}B \\ \vdots & \vdots & & \vdots \\ a_{u1}B & a_{u2}B & \cdots & a_{uw}B \end{bmatrix}$.


THEOREM 6.1. If there exists a canonically balanced matrix $A$ in $s$ integers
$1, 2, \ldots, s$ given by $\sum_{i=1}^{s} iN_i^*$ with the corresponding orthogonal matrix $L^*$, if
there exist $s$ mutually associable designs with incidence matrices $N_1, N_2, \ldots, N_s$
with the canonical matrix of association equal to $L$, and if the integer $c$ in $A$ is
replaced by the matrix $N_c$ ($c = 1, 2, \ldots, s$), then the matrix $A$ will be converted into
an incidence matrix $N$ of a design whose canonical matrix is $L^* \times L$.

PROOF. From the method of construction, it follows that


(6.2) $N = \sum_{i=1}^{s} N_i^* \times N_i$.


Since $N_iN_j' = N_jN_i'$, $NN'$ can be expressed as

(6.3) $NN' = \sum_{i=1}^{s} (N_i^*N_i^{*\prime}) \times (N_iN_i') + \sum_{i>j} (N_i^*N_j^{*\prime} + N_j^*N_i^{*\prime}) \times (N_iN_j')$.

Now each of the terms on the right hand side of the equation (6.3) will be
reduced to the diagonal form by the orthogonal transformation $(L^* \times L)$ by
virtue of Definitions 4.1 and 5.1. Hence the matrix $(L^* \times L)'NN'(L^* \times L)$ is
diagonal. This proves the theorem.
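
The construction step in (6.2), replacing each integer of $A$ by a matrix and obtaining a sum of Kronecker products, can be sketched in a few lines of Python. The matrices below are small hypothetical inputs chosen only to illustrate the substitution (they are not taken from the paper), and the function names are assumptions. The sketch checks the identity $N = \sum_c N_c^* \times N_c$ and nothing more.

```python
def kron(X, Y):
    """Kronecker product X x Y of two matrices given as lists of rows."""
    return [[x * y for x in xrow for y in yrow] for xrow in X for yrow in Y]

def substitute(A, mats):
    """Replace each integer c in A by the matrix mats[c] (all of the same shape)."""
    out = []
    for arow in A:
        height = len(mats[arow[0]])
        for t in range(height):
            out.append([entry for c in arow for entry in mats[c][t]])
    return out

def indicator(A, c):
    """The 0-1 matrix N_c^* marking the positions of the integer c in A."""
    return [[1 if a == c else 0 for a in arow] for arow in A]

def add(X, Y):
    return [[x + y for x, y in zip(xr, yr)] for xr, yr in zip(X, Y)]

# Hypothetical inputs: A in the two integers 1 and 2, and two 2 x 2 incidence matrices.
A = [[1, 2], [2, 1]]
N1 = [[1, 0], [0, 1]]
N2 = [[0, 1], [1, 0]]          # the complement E(2, 2) - N1
mats = {1: N1, 2: N2}

N = substitute(A, mats)
N_kron = add(kron(indicator(A, 1), N1), kron(indicator(A, 2), N2))
print(N == N_kron)
for row in N:
    print(row)
```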

THEOREM 6.2. Let there be $s$ PBIB designs with incidence matrices $N_1, N_2, \ldots,
N_s$. Now let the parameters of the $c$th design be $v'$, $b'$, $r(c)$, $k(c)$, $\lambda_i(c)$, $n_i$, and
$p^i_{jk}$ ($i, j, k = 0, 1, \ldots, m$). Let the $c$th design be associable with the $d$th design,
satisfying the sufficient conditions given in Theorem 4.1, with parameters of
association equal to $\mu_i(c, d)$ ($i = 0, 1, \ldots, m$; $c, d = 1, 2, \ldots, s$). Let there be a
$u \times w$ partially balanced matrix $A$ in $s$ integers with parameters $\alpha(c)$, $\beta(c)$, $n_i^*$,
$p^{i*}_{jk}$, $\nu_i(c, d)$ ($c, d = 1, 2, \ldots, s$; $i, j, k = 0, 1, \ldots, h$) as given in Definition
5.3. Now, if we replace the integer $c$ in the matrix $A$ by the matrix $N_c$ ($c = 1, 2,
\ldots, s$), then the matrix $A$ will be converted into an incidence matrix, say $N$, of a
PBIB with $(h + 1)(m + 1) - 1$ non-zero associate classes and the following
parameters:





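$v = uv'$, $\quad b = wb'$, $\quad r = \sum_{c=1}^{s} \alpha(c) r(c)$, $\quad k = \sum_{c=1}^{s} \beta(c) k(c)$.


Denoting associate classes by two subscripts $(ij)$ ($i = 0, 1, \ldots, h$; $j = 0, 1,
\ldots, m$), the other parameters are given by


$n_{ij} = n_i^* n_j$,

$p^{ij}_{i'j',\,i''j''} = p^{i*}_{i'i''}\, p^{j}_{j'j''}$,

$\lambda_{0t} = \sum_{c=1}^{s} \alpha(c)\lambda_t(c)$,

$\lambda_{ij} = \sum_{c,d=1}^{s} \nu_i(c, d)\mu_j(c, d)$.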

PROOF: The expressions for $v$, $b$, $r$, $k$ are obvious and need no proof. The others
can be proved as follows:

From the method of construction, it can be seen that the $uv'$ rows of the new
matrix $N$ can be grouped in $u$ groups corresponding to the $u$ rows of the matrix
$A$. Hence, the rows of $N$ can be indexed in the natural way by the double index
$(i, j)$, $i = 1, 2, \ldots, u$; $j = 1, 2, \ldots, v'$. The treatments $(i, j)$ and $(i', j')$
will be called $(qt)$th associates if the $i$th and $i'$th rows of $A$ are $q$th associates
and the $j$th and $j'$th treatments of any of the designs $N_c$ are $t$th associates,
$q = 0, 1, \ldots, h$; $t = 0, 1, \ldots, m$. The class $(00)$ is the 0th class as in (3.1). So
we have a PBIB design with $(h + 1)(m + 1) - 1$ non-zero associate classes.
The expressions for the $n$ and $p$ parameters follow from the above association scheme.

In any row the integer $c$ occurs $\alpha(c)$ times; therefore the matrix $N_c$ also
occurs $\alpha(c)$ times. In the design with the incidence matrix $N_c$ the $t$th associate
treatments occur together exactly $\lambda_t(c)$ times; hence the $(0t)$th associates occur
together in exactly $\sum_c \alpha(c)\lambda_t(c)$ blocks. Thus


(6.6) $\lambda_{0t} = \sum_{c=1}^{s} \alpha(c)\lambda_t(c)$.


Similarly in any pair of rows which are $i$th associates the combinations
$\begin{bmatrix} c \\ d \end{bmatrix}$ and $\begin{bmatrix} d \\ c \end{bmatrix}$ occur exactly $\nu_i(c, d)$ times. The $l$th treatment of $N_c$ and the
$k$th treatment of $N_d$ occur together $\mu_j(c, d)$ times, if the $k$th and $l$th treatments
are $j$th associates. Hence the $(ij)$th associate treatments occur together in exactly
$\sum_{c,d} \nu_i(c, d)\mu_j(c, d)$ blocks. Thus


(6.7) $\lambda_{ij} = \sum_{c,d=1}^{s} \nu_i(c, d)\mu_j(c, d)$.


This proves Theorem 6.2.


7. Application to factorial experiments. 

DEFINITION 7.1. If $t_1, t_2, \ldots, t_v$ are the treatment effects, then the contrast
$\sum_i a_it_i$ is called a normalised contrast, if $\sum_i a_i = 0$ and $\sum_i a_i^2 = 1$.

DEFINITION 7.2. A factorial experiment will be called a balanced factorial
experiment (BFE), if the following conditions are satisfied.

(i) Each of the treatments is replicated exactly $r$ times.

(ii) Each of the blocks has the same size $k$.

(iii) Estimates of the contrasts belonging to the different interactions are
uncorrelated with each other.

(iv) For each of the interactions, all the normalised contrasts belonging to
the same interaction are estimated with the same variance.

DEFINITION 7.3. In a factorial experiment, if the conditions (i), (ii) and (iii)
given in Definition 7.2 are satisfied and the condition (iv) is not satisfied for
some of the interactions, then the experiment will be called a partially balanced
factorial experiment (PBFE).

THEOREM 7.1. A BFE in $m$ factors $F_1, F_2, \ldots, F_m$ at $s_1, s_2, \ldots, s_m$ levels
respectively, is a PBIB design with an association scheme given as follows: the two
treatments $(x_1x_2\cdots x_m)$ and $(y_1y_2\cdots y_m)$ (where $x_i$, $y_i$ represent levels of the factor
$F_i$) are $(p_1p_2\cdots p_m)$th associates, where $p_i = 1$ if $x_i = y_i$, and $p_i = 0$ if $x_i \ne y_i$.
Conversely a PBIB design with the above association scheme is a BFE.

The proof of Theorem 7.1 follows from Theorem 6.1 of [3] on substituting 
m = Mm = --+ = m = landh = m. 

THEOREM 7.2. Let there be $s$ BFE's with incidence matrices $N_1, N_2, \ldots, N_s$,
each in $m$ factors $F_1, F_2, \ldots, F_m$ at $s_1, s_2, \ldots, s_m$ levels respectively. Let these
$s$ BFE's be associable PBIB designs satisfying the sufficient conditions given in
Theorem 4.1. Now, if there exists a partially balanced matrix $A$ in $s$ integers $1, 2,
\ldots, s$ with an association scheme for rows equivalent to that for the treatments of a
PBIB design which is a BFE in $n$ factors $F_{m+1}, F_{m+2}, \ldots, F_{m+n}$ at $s_{m+1}, s_{m+2},
\ldots, s_{m+n}$ levels respectively, then by substituting the matrix $N_i$ for the integer $i$ in
the matrix $A$, the matrix $A$ will be converted into an incidence matrix of a BFE in
$(m + n)$ factors.

The proof of the above theorem is obvious from Theorems 6.2 and 7.1. 

As an application of Theorem 7.2, consider the following example: 

EXAMPLE 7.1. Let us take $m = 2$, $n = 1$, $s_1 = s_2 = 2$, $s_3 = 3$. Let the
treatments of the $2^2$ design be denoted by 00, 01, 10, 11 in order. Then confounding the
interaction between the two factors $F_1$ and $F_2$, we get a BFE with the incidence
matrix


E 4 

a 0 

(7.1) N, = 0 e 
Lt 







Let the balanced matrix A in 3 integers and 3 rows be given by 
(7.2) A= 


Now, putting $N_2 = N_3 = E(4, 2) - N_1$ and substituting for $i$ in $A$ the matrix
$N_i$, $i = 1, 2, 3$, we obtain a BFE in $3 \times 2^2$ in 6 blocks of 6 plots each.
Alternatively, if we take $N_2 = E(4, 2) - N_1$ and $N_3 = O(4, 2)$, we obtain a
BFE in 6 blocks of 4 plots each. The first design is identical with the plan
number 6.9 of Cochran and Cox [2].

In Theorem 7.2, a BFE was constructed by exact analogy with Theorem 6.2;
similarly, a PBFE can be constructed by exact analogy with Theorem 6.1. The
necessary and sufficient condition is that the column vectors of the matrices $L$
and $L^*$ given in Theorem 6.1 must form normalised treatment contrasts belonging
to various interactions. The following example will illustrate the method.

EXAMPLE 7.2. Let $N_1^*$ be the incidence matrix of a $3^2$ factorial experiment in
3 blocks of 3 plots each, obtained by confounding only two degrees of freedom
of the interaction $F_1F_2$. Let $N_2^*$ and $N_3^*$ be the matrices formed by cyclically
permuting the columns of $N_1^*$. Then a canonically balanced matrix $B$ in three
integers is given by


(7.3) $B = N_1^* + 2N_2^* + 3N_3^*$.


Take $N_1$ as a $2^2$ BFE as given in (7.1), $N_2 = E(4, 2) - N_1$ and $N_3 = O(4, 2)$.
Then on substituting $N_i$ for $i$ in $B$ we obtain a PBFE in $3^2 \times 2^2$ in 6 blocks of 12
plots each.

The methods given may be of considerable importance for constructing
confounded asymmetrical factorial designs in a large number of factors. Some of the
confounded factorial designs already known and many more can be constructed
by these methods.


8. Acknowledgement. The author is grateful to Prof. M. C. Chakrabarti for 
his help and guidance. 
REFERENCES 

[1] R. C. Bose and T. Shimamoto, "Classification and analysis of partially balanced
incomplete block designs with two associate classes," J. Amer. Stat. Assn., Vol.
47 (1952), pp. 151-184.

[2] William G. Cochran and Gertrude M. Cox, Experimental Designs, 2nd ed., John
Wiley and Sons, New York, 1957.

[3] B. V. Shah, "On balancing in factorial experiments," Ann. Math. Stat., Vol. 29 (1958),
pp. 766-779.

[4] B. V. Shah, "On a generalisation of the Kronecker product designs," Ann. Math. Stat.,
Vol. 30 (1959), pp. 48-54.

[5] Robert M. Thrall and Leonard Tornheim, Vector Spaces and Matrices, John Wiley
and Sons, New York, 1957.

[6] Manohar Narhar Vartak, "On application of Kronecker product of matrices to
statistical designs," Ann. Math. Stat., Vol. 26 (1955), pp. 420-438.





SOME APPROXIMATIONS TO THE BINOMIAL DISTRIBUTION 
FUNCTION 


By R. R. Bahadur
Indian Statistical Institute, Calcutta

1. Summary. Let $p$ be given, $0 < p < 1$. Let $n$ and $k$ be positive integers
such that $np \le k \le n$, and let $B_n(k) = \sum_{r=k}^{n} \binom{n}{r} p^r q^{n-r}$, where $q = 1 - p$.
It is shown that


$B_n(k) = \left[\binom{n}{k} p^k q^{n-k}\right] qF(n + 1, 1; k + 1; p)$,


where $F$ is the hypergeometric function. This representation seems useful for
numerical and theoretical investigations of small tail probabilities. The
representation yields, in particular, the result that, with
$A_n(k) = \left[\binom{n}{k} p^k q^{n-k+1}\right] \left[(k + 1)/(k + 1 - (n + 1)p)\right]$,
we have $1 \le A_n(k)/B_n(k) \le 1 + x^{-2}$, where
$x = (k - np)/(npq)^{1/2}$. Next, let $N_n(k)$ denote the normal approximation to
$B_n(k)$, and let $C_n(k) = (x + \sqrt{q/np})\sqrt{2\pi} \exp[x^2/2]$. It is shown that


$(A_nN_nC_n)/B_n \to 1$


as $n \to \infty$, provided only that $k$ varies with $n$ so that $x \ge 0$ for each $n$. It
follows hence that $A_n/B_n \to 1$ if and only if $x \to \infty$ (i.e. $B_n \to 0$). It also follows
that $N_n/B_n \to 1$ if and only if $A_nC_n \to 1$. This last condition reduces to

$x = o(n^{1/6})$

for certain values of $p$, but is weaker for other values; in particular, there are
values of $p$ for which $N_n/B_n$ can tend to one without even the requirement
that $k/n$ tend to $p$.


2. Introduction. Let $p$ be given, $0 < p < 1$, and let $n$ and $k$ be positive
integers such that

(1) $np \le k \le n$.

Define


(2) $B_n(k) = \sum_{r=k}^{n} \binom{n}{r} p^r q^{n-r}$,


where $q = 1 - p$.¹ The following is an apparently new representation of $B_n(k)$:


Received March 26, 1959. 
¹ Only upper tail probabilities are discussed in the paper. This involves no loss of
generality, since $p$ is arbitrary.


(3) $B_n(k) = \left[\binom{n}{k} p^k q^{n-k}\right] qF(n + 1, 1; k + 1; p)$,

(4) $F = 1 + \frac{(n + 1)}{(k + 1)}\,p + \frac{(n + 1)(n + 2)}{(k + 1)(k + 2)}\,p^2 + \cdots$ ad inf.

To establish (3), consider an unlimited sequence of independent Bernoulli
trials each with success probability $p$. Let $S$ denote the total number of successes
in the first $n$ trials, and let $T$ denote the minimum number of trials required in
order to obtain a total of $n - k + 1$ failures. Then the event $\{S \ge k\}$ is
identical with the event $\{T \ge n + 1\}$. Hence $P(S \ge k) = P(T \ge n + 1)$, and
(3) now follows by referring to the probability distributions of $S$ and $T$.
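
The representation (3) is easy to check numerically. The following Python sketch (an illustration added here, with function names that are assumptions and not the author's) sums the binomial tail directly and compares it with the series form, where the hypergeometric series $F(n+1, 1; k+1; p)$ is summed term by term until the terms are negligible.

```python
from math import comb

def binom_tail_direct(n, k, p):
    """B_n(k): the upper tail sum of binomial probabilities from k to n."""
    q = 1.0 - p
    return sum(comb(n, r) * p**r * q**(n - r) for r in range(k, n + 1))

def hypergeom_F(n, k, p, tol=1e-16):
    """F(n+1, 1; k+1; p) = 1 + (n+1)/(k+1) p + (n+1)(n+2)/((k+1)(k+2)) p^2 + ..."""
    total, term, s = 1.0, 1.0, 0
    while True:
        s += 1
        term *= (n + s) * p / (k + s)
        total += term
        # Stop once the terms are negligible and certain to keep shrinking.
        if term < tol * total and (n + s + 1) * p < (k + s + 1):
            return total

def binom_tail_series(n, k, p):
    """Right-hand side of (3): [C(n,k) p^k q^(n-k)] * q * F(n+1, 1; k+1; p)."""
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k) * q * hypergeom_F(n, k, p)

for n, k, p in [(10, 5, 0.3), (20, 15, 0.5), (7, 7, 0.4)]:
    print(n, k, p, binom_tail_direct(n, k, p), binom_tail_series(n, k, p))
```

The stopping rule relies on the ratio of successive terms, $(n + s)p/(k + s)$, tending to $p < 1$, so the series converges for every $k = 0, 1, \ldots, n$.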

Thus (3) expresses a relation between the binomial and negative binomial
distributions. Another relation (which, however, is not used in this paper)
between these distributions is the following. Let $U$ denote the minimum number
of trials required to obtain a total of $k$ successes. Then $\{S \ge k\}$ is identical
with $\{U \le n\}$. Hence $P(S \ge k) = P(U \le n)$, and this can be written as


(5) $B_n(k) = \left[\binom{n-1}{k-1} p^k q^{n-k}\right] \left[1 + \frac{(m - 1)}{(n - 1)}\,\frac{1}{q} + \frac{(m - 1)(m - 2)}{(n - 1)(n - 2)}\,\frac{1}{q^2} + \cdots + \frac{(m - 1)\cdots(2)(1)}{(n - 1)\cdots(k)}\,\frac{1}{q^{m-1}}\right]$,


where $m = n - k + 1$. It may be noted that (3) is valid for each $k = 0, 1,
2, \ldots, n$, while (5) is valid for $k = 1, 2, \ldots, n$.

Now let

(6) $z = \frac{(n + 1)}{(k + 1)}\,p$.


Then $0 < z < 1$ by (1). Let us write (4) in the form


(7) $F = \sum_{s=0}^{\infty} a_sz^s$,


where


(8) $a_s = 1$ for $s = 0, 1$, and $a_s = \prod_{i=1}^{s-1} \left(1 + \frac{i}{n+1}\right) \Big/ \left(1 + \frac{i}{k+1}\right)$ for $s = 2, 3, \ldots$.


Since $0 < a_s \le 1$ for each $s$, it is clear from (3) and (7) that


(9) $A_n(k) = \left[\binom{n}{k} p^k q^{n-k}\right] \left[\frac{q}{1 - z}\right]$


is an upper bound for $B_n(k)$. More exact upper bounds, and also lower bounds,
are derived from (3) in Section 3.

The case when $n \to \infty$ (and $k$ varies with $n$ so that (1) is satisfied for each $n$)
is considered in Section 4. Since in this case $a_s \to 1$ for each fixed $s$ (cf. (8)) it
might seem plausible that $|A_n - B_n| \to 0$. However, $|A_n - B_n|$ tends to
zero if and only if $B_n$ tends to zero, and then $A_n$ is a precise estimate of $B_n$,
in the sense that $A_n/B_n$ tends to one. It is also shown in Section 4 that $A_n$ can
be modified by multiplying it with a certain factor so that the relative error in
the modified estimate always tends to zero. It turns out that the normal
approximation to $B_n$ is an explicit divisor of the correction required by $A_n$ in the
general case, so that an estimate of the relative error of the normal approximation
in the general case is obtained. This last estimate leads, in particular, to
necessary and sufficient conditions in order that the relative error of the normal
approximation tend to zero as $n \to \infty$.


3. Bounds for $B_n(k)$. The identity (3) suggests the following method for
numerical evaluation of $B_n(k)$ to any desired degree of accuracy. Suppose that
we compute $F$ only up to the first $j + 1$ terms and thus take


(10) $B_n^{(j)}(k) = \left[\binom{n}{k} p^k q^{n-k}\right] q \left[\sum_{s=0}^{j} a_sz^s\right]$


as an approximation to $B_n(k)$. Since $k \le n$, it follows easily from (8) that


(11) $\frac{a_{j+1+t}}{a_{j+1}} \le a_t$,


the inequality being strict unless $t = 0$. Consequently we have


(12) $1 - (a_{j+1}z^{j+1}) < \frac{B_n^{(j)}(k)}{B_n(k)} < 1$.


Since $a_{j+1}z^{j+1} < a_jz^j$, it follows from (12) that the relative error in $B_n^{(j)}(k)$
does not exceed the last term included in the sum on the right side of (10).
Moreover, since


(13) $a_{j+1}z^{j+1} \le z^{j+1}$,


it is easy to obtain, in advance of undertaking the calculation, an upper bound
to the number of terms required to attain a specified relative accuracy. We
note also that, since $A_n(k)$ is an overestimate of $B_n(k)$, (12) implies


(14) $0 < B_n(k) - B_n^{(j)}(k) < (a_{j+1}z^{j+1}) \min\{A_n(k), 1\}$.


The preceding method of evaluation of $B_n(k)$, although applicable in general,
is efficient only when $z$ is appreciably less than one. A parallel method, with
similar properties, can be based on (5).
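
The truncation method just described can be sketched as follows. This Python fragment is an illustration only; the function names are assumptions, and the coefficients are taken as $a_0 = a_1 = 1$ and $a_s = \prod_{i=1}^{s-1} (1 + i/(n+1))/(1 + i/(k+1))$ with $z = (n+1)p/(k+1)$, which is how (6) and (8) are read here. For each $j$ it prints the ratio of the truncated value to the exact tail next to the lower bound $1 - a_{j+1}z^{j+1}$.

```python
from math import comb

def coeffs_a(n, k, count):
    """a_0, ..., a_{count-1}, with a_0 = a_1 = 1 and a_s a product of s - 1 ratios."""
    a = [1.0, 1.0]
    for s in range(2, count):
        i = s - 1
        a.append(a[-1] * (1 + i / (n + 1)) / (1 + i / (k + 1)))
    return a[:count]

def tail_exact(n, k, p):
    """B_n(k) by direct summation of the binomial probabilities."""
    q = 1 - p
    return sum(comb(n, r) * p**r * q**(n - r) for r in range(k, n + 1))

def tail_truncated(n, k, p, j):
    """B_n^{(j)}(k): the power series in z cut off after the term in z^j."""
    q = 1 - p
    z = (n + 1) * p / (k + 1)
    a = coeffs_a(n, k, j + 1)
    partial = sum(a[s] * z**s for s in range(j + 1))
    return comb(n, k) * p**k * q**(n - k) * q * partial

n, k, p = 10, 5, 0.3
z = (n + 1) * p / (k + 1)
B = tail_exact(n, k, p)
for j in range(6):
    a_next = coeffs_a(n, k, j + 2)[j + 1]
    ratio = tail_truncated(n, k, p, j) / B
    print(j, round(ratio, 6), round(1 - a_next * z ** (j + 1), 6))
```

Since the bound $a_{j+1}z^{j+1} \le z^{j+1}$ depends only on $z$, the number of terms needed for a given relative accuracy can be fixed before any of the sum is computed.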

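An alternative method of evaluation, which is useful even if $z$ is nearly one,
is the following. Let


(15) $b_r = \frac{(n + r)(k + r - 1)p}{(k + 2r - 2)(k + 2r - 1)}$, $\quad c_r = \frac{r(m - r)p}{(k + 2r - 1)(k + 2r)}$


for $r = 1, 2, \ldots$, where $m = n - k + 1$. Then $F$ defined by (4) can be
represented (cf., e.g., [1], Chap. XVIII) as a terminating continued fraction thus:


(16) $F = \cfrac{1}{1 - \cfrac{b_1}{1 + \cfrac{c_1}{1 - \cfrac{b_2}{1 + \cfrac{c_2}{1 - \cdots}}}}}$.

Let


(17) $F^{(1)} = \cfrac{1}{1 - b_1}$, $\quad F^{(2)} = \cfrac{1}{1 - \cfrac{b_1}{1 + c_1}}$, $\quad F^{(3)} = \cfrac{1}{1 - \cfrac{b_1}{1 + \cfrac{c_1}{1 - b_2}}}$, $\quad F^{(4)} = \cfrac{1}{1 - \cfrac{b_1}{1 + \cfrac{c_1}{1 - \cfrac{b_2}{1 + c_2}}}}$, $\ldots$.


We have $0 < b_r < 1$ for each $r$, and $0 < c_r < 1$ for $r = 1, 2, \ldots, m - 1$,
by (1) and (15). It follows hence from (16) that


(18) $F^{(3)} < F^{(4)} < F^{(7)} < F^{(8)} < \cdots \le F \le \cdots < F^{(6)} < F^{(5)} < F^{(2)} < F^{(1)}$.


The equality signs are included here only for the sake of literal accuracy; in
fact, $F^{(r)} = F$ for $r = 2(n - k) + 1$, but all other inequalities in (18) are
strict.

Now let $A_n^{(r)}(k) = \left[\binom{n}{k} p^k q^{n-k+1}\right] \cdot F^{(r)}$ for $r = 1, 2, \ldots$, where $F^{(r)}$ is given
by (15) and (17). We then obtain from (3) and (18) sequences of upper and
lower bounds to $B_n(k)$, the general form of these bounds being $A^{(4s-1)} < A^{(4s)} \le
A^{(4s+3)} \le B \le A^{(4s+2)} \le A^{(4s+1)} \le A^{(4s-2)}$ for $s = 1, 2, \ldots$. It should be
noted that $A_n^{(1)}(k) = A_n(k)$, where $A_n(k)$ is defined by (6) and (9).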
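
A small Python sketch of the continued fraction approximations follows. It is an illustration under a stated assumption: the coefficients are taken to be $b_r = (n+r)(k+r-1)p/((k+2r-2)(k+2r-1))$ and $c_r = r(m-r)p/((k+2r-1)(k+2r))$ with $m = n - k + 1$, which are the coefficients of the standard Gauss continued fraction for $F(n+1, 1; k+1; p)$ and give $b_1 = z$. The function names are hypothetical. The approximation $F^{(r)}$ is obtained by cutting the fraction off after its $r$th coefficient and evaluating it from the bottom upwards.

```python
from math import comb

def cf_coefficients(n, k, p, count):
    """b_1, c_1, b_2, c_2, ... (interleaved), first `count` entries."""
    m = n - k + 1
    d = []
    r = 1
    while len(d) < count:
        d.append((n + r) * (k + r - 1) * p / ((k + 2 * r - 2) * (k + 2 * r - 1)))  # b_r
        d.append(r * (m - r) * p / ((k + 2 * r - 1) * (k + 2 * r)))                # c_r
        r += 1
    return d[:count]

def convergent(n, k, p, r):
    """F^(r): the continued fraction cut off after its r-th coefficient."""
    d = cf_coefficients(n, k, p, r)
    tail = 1.0
    for i in range(r, 0, -1):
        sign = -1.0 if i % 2 == 1 else 1.0   # b's enter with minus, c's with plus
        tail = 1.0 + sign * d[i - 1] / tail
    return 1.0 / tail

def F_exact(n, k, p):
    """F recovered from the binomial tail: B_n(k) / (C(n,k) p^k q^(n-k) q)."""
    q = 1 - p
    B = sum(comb(n, s) * p**s * q**(n - s) for s in range(k, n + 1))
    return B / (comb(n, k) * p**k * q**(n - k) * q)

n, k, p = 10, 5, 0.3
print("F =", F_exact(n, k, p))
for r in range(1, 2 * (n - k) + 2):
    print(r, convergent(n, k, p, r))
```

For $n = 10$, $k = 5$, $p = 0.3$ the first two approximations lie above $F$ and the next two below it, and the fraction terminates at $r = 2(n - k) + 1$.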

Another method of using continued fractions to obtain bounds on B, which is 
based on (2) itself rather than (3), is given in Uspensky ([2], pp. 52-56). This 
method, which is attributed in [2] to Markov, does not appear to be generally 
known, and might therefore be described here. Let 


(19) 6 = mak — 1+) _ oa gk, 
ee Kh —24+2%K—-14¢2)q 3 " ) &—1 + Ws 2 


forr = 1, 2, ---+ , and let 


> 1 - aa ee By b 
G =; G iia 


(20) 
(3) D1 1 . (4) 1 1 #2 


“{-t4T= "Jo? 1 i 







Define M‘)\(k) = | (zeta |." for r = 1, 2,---, where G is given by 


(19) and (20). Suppose k > np + 1. It can then be shown [2] that, as with 
the A’s, we have M“” s M” = M“" 3s Bs M“*” s M“” s mM“ 
for s = 1, 2,---. Here B = M” for r = 2(n — k), but all other inequalities 


are strict. The writer conjectures that we always have 
(21) mM?» = or |B os mM” | <s |B So A” | 


for r = 1, 2,--- , the inequality being strict for r S (n — k). If so, Markov’s 
method of computation is superior, by one step, to the one described in the 
preceding paragraph. 

The following Theorem 1 shows that, if $n$ and $k$ are large and $z$ is appreciably
less than one (i.e. if $B_n(k)$ is very small), then $A_n(k)$ is a good estimate of
$B_n(k)$. The theorem is an expression of the fact that, under the conditions
stated, even the first few continued fraction approximations to $B$ are very close
to $B$.

Let

(22) $\alpha_n(k) = \left(\frac{1}{k + 1} - \frac{1}{n + 1}\right) \left(\frac{k + 1}{k + 2}\right) z$,

where $z$ is given by (6), and let


(23) $x_n(k) = \frac{k - np}{\sqrt{npq}}$ $\quad (0 \le x_n(k) \le \sqrt{nq/p})$.


THEOREM 1: Given integers $n$ and $k$ satisfying (1), let $B$ be defined by (2), $A$ by
(6) and (9), and $\alpha$ and $x$ by (22), (23). Then


(24) $1 + \alpha[z/(1 - z)](1 + \alpha)^{-1} \le A/B \le 1 + \alpha[z/(1 - z)](1 + \alpha - z)^{-1}$.

In particular,

(25) $1 \le A/B < 1 + x^{-2}$.


PROOF: Since $b_1 = z$ by (6) and (15), we have $F^{(1)} = 1/(1 - z)$ from (17).
Hence $A/B = F^{(1)}/F$ by (3) and (9). Consequently,

(26) $\frac{F^{(1)}}{F^{(2)}} \le \frac{A}{B} \le \frac{F^{(1)}}{F^{(3)}}$

by (18). A straightforward calculation shows that the lower bound in (26)
equals $1 + \alpha[z/(1 - z)](1 + \alpha)^{-1}$, and that the upper bound equals


$1 + \alpha[z/(1 - z)](1 + \alpha - z + \delta)^{-1}$,


where $\delta = b_1 - b_2$. Since $\delta > 0$, (24) therefore follows from (26). It is easily
seen that $\alpha[z/(1 - z)](1 + \alpha - z)^{-1} \le \alpha z/(1 - z)^2 < x^{-2}$, so that (25)
follows from (24). This completes the proof.
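
The inequality $1 \le A/B \le 1 + x^{-2}$ can be illustrated numerically. The sketch below is an added example with hypothetical function names; it takes $A_n(k) = [\binom{n}{k}p^kq^{n-k}]\,q/(1 - z)$ with $z = (n+1)p/(k+1)$ and $x = (k - np)/\sqrt{npq}$, and it uses only cases with $k > np$ so that $x^{-2}$ is finite.

```python
from math import comb, sqrt

def tail_exact(n, k, p):
    """B_n(k) by direct summation of the binomial probabilities."""
    q = 1 - p
    return sum(comb(n, r) * p**r * q**(n - r) for r in range(k, n + 1))

def upper_estimate(n, k, p):
    """A_n(k) = [C(n,k) p^k q^(n-k)] * q / (1 - z), with z = (n+1)p/(k+1)."""
    q = 1 - p
    z = (n + 1) * p / (k + 1)
    return comb(n, k) * p**k * q**(n - k) * q / (1 - z)

def standardized(n, k, p):
    """x = (k - np) / sqrt(npq)."""
    return (k - n * p) / sqrt(n * p * (1 - p))

for n, k, p in [(10, 5, 0.3), (20, 15, 0.5), (50, 30, 0.4)]:
    ratio = upper_estimate(n, k, p) / tail_exact(n, k, p)
    x = standardized(n, k, p)
    print(n, k, p, round(ratio, 4), round(1 + x ** -2, 4))
```

In each case the ratio $A/B$ is much closer to one than the bound $1 + x^{-2}$ requires, which is in line with the remark that $A_n(k)$ is a good estimate when $B_n(k)$ is small.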

In concluding this section reference may be made to certain bounds given by 


(26) 







Hodges and Lehmann [3] for any probability of the form Ex(t )p' ". The 


reader may verify that the bounds for B obtainable by taking a = k and 


ae p. 331, (3.1), say L and U, always satisfy L < A® <s B< 
Us A” 


4. Asymptotic estimates. The normal approximation. In this section we
consider a given sequence of positive integers $k_1, k_2, \ldots$ such that

(27) $np \le k_n \le n$ $\quad (n = 1, 2, \ldots)$,

and we study the behaviour of $B_n(k_n)$ as $n \to \infty$. Since the sequence $\{k_n\}$
remains fixed throughout the discussion, we abbreviate $B_n(k_n)$ to $B_n$, and $A_n(k_n)$
to $A_n$. Similarly, $x_n(k_n)$ defined by putting $k = k_n$ in (23) is abbreviated to $x_n$.
Let $N_n$ denote the usual normal approximation² to $B_n$, i.e.


(28) $N_n = \frac{1}{\sqrt{2\pi}} \int_{x_n}^{\infty} e^{-t^2/2}\,dt$.


Let


(29) $C_n = (x_n + \sqrt{q/np})\sqrt{2\pi} \exp[\tfrac{1}{2}x_n^2]$.


THEOREM 2: $(A_nN_nC_n)/B_n = 1 + \epsilon_n$, where $\epsilon_n \to 0$ as $n \to \infty$.

This result is valid without any restriction on the sequence $\{k_n\}$ other than
(27). As may be seen from the proof of Theorem 2, $\epsilon_n$ is at most of the order
$1/\sqrt{n}$ if $\{x_n\}$ is a bounded sequence, and at most of the order $1/x_n^2$ if $x_n \to \infty$.
If, however, the sequence $\{x_n\}$ has finite limit points of arbitrarily large
magnitude, the order of $\epsilon_n$ is indeterminate from the present proof.

To prove Theorem 2, we note first that

(30) $A_n C_n = \left[\binom{n}{k_n} p^{k_n} q^{n-k_n}\right] \left(1 + x_n \sqrt{q/(np)} + \frac{1}{np}\right) \sqrt{2\pi npq}\, \exp[\tfrac{1}{2} x_n^2],$

by a straightforward computation using (6), (9), (23) and (29). Suppose now
that $\{x_n\}$ is a bounded sequence. In this case,

(31) $n - k_n \to \infty$

as $n \to \infty$. Since $k_n$ certainly tends to infinity by (27), Stirling's formula can
be applied to the binomial coefficient on the right side of (30). This application
shows that

(32) $A_n C_n = 1 + O(1/\sqrt{n}).$

Since, by the De Moivre-Laplace limit theorem, we have

(33) $B_n - N_n = O(1/\sqrt{n})$

in any case, and since $N_n$ is bounded away from zero in the present case, it
follows from (32) and (33) that $(A_n N_n C_n)/B_n = 1 + O(1/\sqrt{n})$.

² The case when the normal approximation includes a continuity correction is discussed
at the end of this section.

Suppose next that $x_n \to \infty$ as $n \to \infty$. In this case it follows from (28) and
(29) by a property of the normal distribution ([4], p. 166) that

(34) $N_n C_n = 1 + O(1/x_n^2).$

It is plain from (25) and (34) that $(A_n N_n C_n)/B_n = 1 + O(1/x_n^2)$.

To treat the general case write $r_n = (A_n N_n C_n)/B_n$ for $n = 1, 2, \cdots$, and
let $l$ be a limit point of the sequence $\{r_n\}$, $0 \le l \le \infty$. Then there exists a strictly
increasing sequence of positive integers, say $i_1, i_2, \cdots$, such that $r_n \to l$ as
$n \to \infty$ through the sequence $i$. The sequence $i$ surely contains a subsequence,
say $j_1, j_2, \cdots$, such that $x_n$ tends to a finite or infinite limit as $n \to \infty$ through
the sequence $j$. Hence $l = 1$ by the preceding two paragraphs. Thus 1 is the
only limit point of the sequence $\{r_n\}$. This completes the proof.
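Theorem 2 can be illustrated numerically. The sketch below is not part of the proof: it evaluates $(A_n N_n C_n)/B_n$ for a single $n$, using (28) and (29) for $N_n$ and $C_n$ and the same assumed forms of $B_n$, $z$ and $A_n$ as stated in its comments (those definitions lie outside this section). The function names are illustrative.

```python
from math import lgamma, log, exp, sqrt, pi, erfc


def log_binom_pmf(n, r, p):
    """Log of C(n, r) p^r q^(n-r)."""
    return (lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)
            + r * log(p) + (n - r) * log(1 - p))


def theorem2_ratio(n, k, p):
    """Evaluate (A_n N_n C_n) / B_n for one triple (n, k, p) with np <= k <= n."""
    q = 1 - p
    # Assumed B_n: upper binomial tail P(X >= k).
    B = sum(exp(log_binom_pmf(n, r, p)) for r in range(k, n + 1))
    # Assumed A_n: leading term over (1 - z), with z = p(n-k) / (q(k+1)).
    z = p * (n - k) / (q * (k + 1))
    A = exp(log_binom_pmf(n, k, p)) / (1 - z)
    x = (k - n * p) / sqrt(n * p * q)           # x_n of (23)
    N = 0.5 * erfc(x / sqrt(2))                 # N_n of (28)
    C = (x + sqrt(q / (n * p))) * sqrt(2 * pi) * exp(0.5 * x * x)  # C_n of (29)
    return A * N * C / B
```

With $n = 2000$ and $p = 0.3$, the values $k = 620$ (moderate $x_n$) and $k = 700$ (larger $x_n$) should both give ratios close to 1, in line with the two cases treated in the proof.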

COROLLARY 1: As $n \to \infty$, the following four statements are mutually equivalent:

(35) $|A_n - B_n| \to 0,$
(36) $A_n/B_n \to 1,$
(37) $B_n \to 0,$
(38) $x_n \to \infty.$

PROOF: The equivalence of (37) and (38) is immediate from (28) and (33).
It is evident from (25) that (38) implies (36). Since (36) always implies (35),
it will now suffice to show that (35) implies (38). Suppose to the contrary that
(35) holds, but that the sequence $\{x_n\}$ has a finite limit point, say $a$, $0 \le a < \infty$.
Let $i_1 < i_2 < \cdots$ be a sequence of integers such that $x_n \to a$ as $n \to \infty$ through
the sequence $i$. With $n$ restricted to this sequence, $B_n$ is bounded away from
zero, by (28) and (33); hence $A_n$ is also bounded away from zero by (35). It
follows from Theorem 2 and (35) that $A_n(1 - N_n C_n) \to 0$. Hence $N_n C_n \to 1$
as $n \to \infty$ through $i$. This is a contradiction, since $x_n \to a$ implies ([4], p. 166)
that $N_n C_n \to b$, where $b = 0$ if $a = 0$ and $0 < b < 1$ if $0 < a < \infty$. This
completes the proof.

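Next, define

(39) $f(y) = \left(1 + \frac{y}{p}\right)^{p+y} \left(1 - \frac{y}{q}\right)^{q-y} e^{-y^2/(2pq)}$

and

(40) $g(y) = \left[\frac{p(q - y)}{q(p + y)}\right]^{1/2}$

for $-p < y < q$. Write

(41) $y_n = \frac{k_n}{n} - p = x_n \sqrt{\frac{pq}{n}}.$

COROLLARY 2: If (31) is satisfied then

(42) $N_n/B_n = [f(y_n)]^n g(y_n)(1 + \epsilon_n),$

where $\epsilon_n \to 0$ as $n \to \infty$.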

This corollary to Theorem 2 follows from (30), (39), (40) and (41) by an
application of Stirling's formula. We omit the detailed calculation. The corollary
is a generalization, in the present very special case, of estimates of the type
introduced by Cramér and developed by Feller and Petrov. Petrov has recently
given the best versions of such estimates [5]³. The generalization consists
in replacing the condition

(43) $y_n \to 0$

of Petrov's theorems with the much weaker condition (31). However, the
order of the $\epsilon_n$ in (42) remains indeterminate.

Corollary 2 is useful in certain applications [3], [6] where $B_n$ tends to zero
very rapidly. Another application, with which the remainder of this section is
concerned, is to the study of exact conditions under which

(44) $\lim_{n \to \infty} \{R_n\} = 1$

or at least

(45) $0 < \liminf_{n \to \infty} \{R_n\} \le \limsup_{n \to \infty} \{R_n\} < \infty,$

where

(46) $R_n = N_n/B_n.$

Define

(47) $\varphi(t) = \log(1/t) + \frac{t - 1}{2t}$

for $0 < t < 1$. It is easily seen that, as $t$ increases from 0 to 1, $\varphi$ increases steadily
from $-\infty$ to a positive maximum at $t = \tfrac{1}{2}$ and then decreases to zero. Let $\rho$
denote the root of the equation $\varphi(t) = 0$, $0 < \rho < \tfrac{1}{2}$. ($\rho = .2847$).

COROLLARY 3: If $p > \tfrac{1}{2}$ or $p \le \rho$, then (44) holds if and only if

(48) $n y_n^3 = o(1).$

If $p = \tfrac{1}{2}$, (44) holds if and only if

(49) $n y_n^4 = o(1).$

If $\rho < p < \tfrac{1}{2}$, (48) is sufficient for (44), but may not be necessary; if, however,
$\lim_{n \to \infty} y_n$ exists, then (48) is necessary for (44).
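The constant $\rho$ that separates the cases of Corollary 3 can be recovered numerically. The sketch below assumes that (47) reads $\varphi(t) = \log(1/t) + (t - 1)/(2t)$, a form that vanishes at $t = 1$ and peaks at $t = \tfrac{1}{2}$ as the text describes, and locates the root on $(0, \tfrac{1}{2})$ by plain bisection. The helper names are illustrative.

```python
from math import log


def phi(t):
    """Assumed form of (47): log(1/t) + (t - 1) / (2t), for 0 < t <= 1."""
    return log(1 / t) + (t - 1) / (2 * t)


def bisect_root(func, lo, hi, tol=1e-12):
    """Bisection for a sign change of func on [lo, hi]."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


# phi is negative near 0 and positive at 1/2, so the root lies in between.
rho = bisect_root(phi, 0.05, 0.5)
```

The computed `rho` should agree with the value .2847 quoted in the text to the four places given.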

³ Petrov's work was pointed out to the writer by Mr. Ranga Rao of the Indian Statistical
Institute. The writer wishes to thank Mr. Ranga Rao for valuable suggestions and
discussions during the preparation of this paper.

In view of (41), conditions (48) and (49) are restrictions on the rate at
which $x_n$ becomes large, if it does so at all.⁴ It is therefore rather surprising
that (44) can hold in the case $\rho < p < \tfrac{1}{2}$ even when $\{x_n\}$ contains a subsequence
which tends to infinity very rapidly. The details of this exceptional case are
given in the course of the following proof.

To prove Corollary 3, suppose first that we have $n - k_n = m$ for all $n$, where
$m$ is a fixed non-negative integer. In this case it follows from Theorem 2 by
(23), (30), (46) that

(50) $\log R_n = n\varphi(p) - \log \binom{n}{m} - \tfrac{1}{2} \log n + O(1),$

where $\varphi$ is given by (47). The right side of (50) $\to +\infty$ or $-\infty$ according as
$p > \rho$ or $p \le \rho$, so that (45) does not hold.

It now follows that (31) is necessary for (45). For, if (31) is not satisfied,
there exists an $m$ such that $n - k_n = m$ for infinitely many $n$, and $\log R_n$ is
therefore unbounded, by the preceding paragraph. It will be shown presently
that in fact

(51) $\limsup_{n \to \infty} \{y_n\} < q$

is necessary for (45).
Let

(52) $h(y) = \log f(y)$

for $0 \le y < q$, where $f$ is given by (39). We shall require the following easily
verified properties of $h$ regarded as a function of $y$. (i) In the neighborhood of
$y = 0$, $h$ is of the order $y^3$ if $p \ne \tfrac{1}{2}$ and of the order $y^4$ if $p = \tfrac{1}{2}$. (ii) $h(y) \to \varphi(p)$
as $y \to q$, where $\varphi$ is given by (47). (iii) If $p \ge \tfrac{1}{2}$, $h$ is positive and steadily
increasing in the interval $(0, q)$. (iv) If $p \le \rho$, where $\rho$ is the number defined
in the paragraph containing (47), then $h$ is negative in the interval $(0, q)$. (v)
If $\rho < p < \tfrac{1}{2}$ then the equation $h = 0$ has a root, $\alpha$ say, in the interval $(0, q)$;
$h$ is negative in $(0, \alpha)$ and positive and increasing in $(\alpha, q)$; the derivative of
$h$ is positive at $y = \alpha$.

Let us write

(53) $w_n = n h(y_n) + \log g(y_n).$

It then follows from (41), (42), (46), (52) and (53) that

(54) $\log R_n = w_n + o(1)$

provided only that (31) is satisfied.
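Property (v) of $h$ can be explored numerically. The sketch below assumes the forms $f(y) = (1 + y/p)^{p+y}(1 - y/q)^{q-y} e^{-y^2/(2pq)}$ for (39) and $g(y) = [p(q-y)/(q(p+y))]^{1/2}$ for (40); both are reconstructions consistent with (42), (47) and (50) rather than quotations. It finds the root $\alpha$ of $h = 0$ for one value of $p$ in $(\rho, \tfrac{1}{2})$ by bisection. All names are illustrative.

```python
from math import log, sqrt


def h(y, p):
    """h(y) = log f(y), with the assumed form of f in (39); needs 0 <= y < q."""
    q = 1 - p
    return ((p + y) * log(1 + y / p) + (q - y) * log(1 - y / q)
            - y * y / (2 * p * q))


def g(y, p):
    """Assumed form of g in (40)."""
    q = 1 - p
    return sqrt(p * (q - y) / (q * (p + y)))


def root_alpha(p, tol=1e-12):
    """Root of h(., p) in (0, q), assuming rho < p < 1/2 as in property (v)."""
    q = 1 - p
    lo, hi = 1e-2, q - 1e-9   # h is negative near 0 and positive near q
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid, p) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha = root_alpha(0.4)
```

For $p = 0.4$ the root falls strictly inside $(0, q) = (0, 0.6)$, with $h$ negative to its left and positive to its right, as property (v) states.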

We can now show that (51) is necessary for (45). Since (45) is already known
to imply (31), it follows from (54) that it will suffice to show that $w_n = O(1)$
and (31) imply (51). First consider the case when $p \le \rho$. In this case $h(y_n) \le 0$
and hence $w_n \le \log g(y_n)$ for every $n$, by (53). It now follows by referring to
(40) that (51) must hold, for otherwise $\liminf \{w_n\} = -\infty$. Now consider the
case when $\rho < p < 1$, and suppose that $\{y_n\}$ contains a subsequence tending
to $q$. Since in the present case $h(y)$ tends to a positive limit as $y \to q$, it follows
from (40) and (53), using the hypothesis $w_n = O(1)$, that there exist positive
constants $c_1$ and $c$ such that $\log(q - y_n) < c_1 - cn$ for infinitely many $n$.
Hence $\liminf \{n(q - y_n)\} = 0$. This contradicts (31), since $n(q - y_n) =
(n - k_n)$ by (41).

⁴ It is well known that (48) always implies (44): [4], pp. 178-181 and [5].

Since (51) evidently implies that (31) holds, and also that $\log g(y_n) = O(1)$,
the following general criterion is now plain from (53) and (54): (45) holds if
and only if (51) is satisfied and $n \cdot h(y_n) = O(1)$. By reference to the properties
of the function $h$ we see that this criterion reduces to (48) with $o$ replaced by
$O$ in case $p > \tfrac{1}{2}$ or $p \le \rho$, and to (49) with the same modification in case $p = \tfrac{1}{2}$.
The reduction of the criterion in the case $\rho < p < \tfrac{1}{2}$ is also straightforward
and is omitted.

It follows easily from the preceding criterion and (53) and (54) that (44)
holds if and only if (51) is satisfied and $w_n = o(1)$. This reduces to (48) if $p > \tfrac{1}{2}$,
or if $p \le \rho$, and to (49) if $p = \tfrac{1}{2}$. In case $\rho < p < \tfrac{1}{2}$, the present criterion
reduces to the following: 1) the sequence $\{y_n\}$ has no limit points other than 0
and $\alpha = \alpha(p)$, where $\alpha$ is the positive root of the equation $h(y) = 0$, $0 < \alpha < q$;
2) if $i_1, i_2, \cdots$ is any increasing sequence of positive integers such that $y_n \to 0$
as $n \to \infty$ through the sequence $i$, then (48) holds for $n$ restricted to $i$; and
3) if $j_1, j_2, \cdots$ is any increasing sequence of positive integers such that $y_n \to \alpha$
as $n \to \infty$ through the sequence $j$, then

(55) $y_n = \alpha + \frac{b}{n} + o\left(\frac{1}{n}\right)$

for $n$ restricted to $j$, where $b = [h'(\alpha)]^{-1} \log [g(\alpha)]^{-1}$.

It is clear that if (48) holds then 1), 2) and 3) are satisfied, 3) being vacuous.
We shall now show that in general 3) is not vacuous, i.e. there are values of $p$
and corresponding sequences $\{k_n\}$ of integers for which (55) holds for $n$ restricted
to some sequence $j$. For any non-negative number $r$, let $[r]$ denote the greatest
integer contained in $r$. Then (55) can be written as

(56) $k_n - \{[b] + [n(p + \alpha)]\} = \theta + \xi_n + \epsilon_n,$

where $\theta$ is a constant, $0 \le \theta < 1$, $0 \le \xi_n = n(p + \alpha) - [n(p + \alpha)] < 1$, and
$\epsilon_n \to 0$. Suppose that $p + \alpha$ is irrational. Then, as is well known, each point
in $[0, 1]$ is a limit point of the sequence $\{\xi_n\}$. Consequently, there exists a
sequence $j_1, j_2, \cdots$ such that $\xi_n \to 1 - \theta$ as $n \to \infty$ through $j$. If we let

$k_n = [n(p + \alpha)] + [b] + 1$

for $n = j_1, j_2, \cdots$ and $k_n = [np + \sqrt{n}]$ (say) for all other values of $n$, it follows
that 1), 2) and 3) are satisfied. Thus 3) is non-vacuous at least when
$p + \alpha(p)$ is irrational. It is not difficult to see that $p + \alpha(p)$ is a non-constant
and continuous function of $p$, so that it does in fact assume irrational values
as $p$ varies from $\rho$ to $\tfrac{1}{2}$.

To complete the proof of Corollary 3, consider an arbitrary but fixed $p$ in
$(\rho, \tfrac{1}{2})$, and suppose that $\lim_{n \to \infty} y_n$ exists for the given sequence $\{k_n\}$. Assume,
contrary to the last statement in Corollary 3, that (44) holds but (48) does
not. It then follows from the necessary and sufficient conditions 1), 2) and 3)
that in the present case (55) holds as $n \to \infty$ through the entire sequence 1, 2,
3, $\cdots$. Express (55) in the form (56). Since the left side of (56) is an integer,
since $\epsilon_n \to 0$, and since $\theta$ and $\xi_n$ are in $[0, 1)$ for each $n$, it follows that, for all
sufficiently large $n$, $\theta + \xi_n + \epsilon_n = 0$, or 1 or 2. Let $L$ denote the set of all limit
points of the sequence $\theta + \xi_n$. We then have $L \subset \{0, 1, 2\}$.

The conclusion of the preceding paragraph implies that $p + \alpha$ cannot be
irrational. Suppose therefore that $p + \alpha = u/v$, where $u$ and $v$ are integers
such that $0 < u < v$. Assuming that $u/v$ is in its lowest terms, the limit points
of the sequence $\xi_n$ are $0, 1/v, 2/v, \cdots$, and $(v - 1)/v$. Hence

$L = \{\theta + (r/v) : r = 0, 1, 2, \cdots, v - 1\}.$

This implies, in particular, that $\theta$ is in $L$. Hence $\theta = 0$, or 1, or 2 by the
preceding paragraph; hence $\theta = 0$, since $0 \le \theta < 1$ in any case. We now see that
$L = \{(r/v) : r = 0, 1, \cdots, v - 1\}$. This cannot be a subset of $\{0, 1, 2\}$ unless
$v = 1$. However, $v = 1$ implies $0 < u < 1$ and is therefore a contradiction.
This completes the proof.

It may be of some interest to examine the modifications required in Corollary
3 when the normal approximation includes a correction for continuity, e.g.,
when $N_n$ is defined by (28) but with $x_n = (k_n - \tfrac{1}{2} - np)/(npq)^{1/2}$. It turns
out that Corollary 3 requires no modification for this particular continuity
correction. For certain more general 'corrections', the only modification is that
(48) is not necessary for (44) if $\rho < p < \tfrac{1}{2}$, even if $\lim y_n$ exists.

The conclusions just stated are readily derived as follows. Let $\{c_n\}$ be a bounded
sequence, and let

(57) $x_n^* = \frac{k_n - c_n - np}{\sqrt{npq}} = x_n - \frac{c_n}{\sqrt{npq}}, \qquad N_n^* = \frac{1}{\sqrt{2\pi}} \int_{x_n^*}^{\infty} e^{-t^2/2}\, dt, \qquad R_n^* = N_n^*/B_n.$

We wish to know whether

(58) $\lim_{n \to \infty} \{R_n^*\} = 1.$

Suppose for the moment that $x_n \to \infty$. Then $x_n^*$ also $\to \infty$, and it follows easily
from [4], p. 166 that $N_n/N_n^* = (1 + \epsilon_n) \exp(-c_n y_n/pq)$, where $\epsilon_n \to 0$. This
asymptotic formula is surely valid if $\{x_n\}$ is bounded, for then $y_n \to 0$ and
$N_n/N_n^* \to 1$. It follows therefore (cf. the paragraph preceding Corollary 1) that
the formula is valid in general, i.e.

(59) $\log N_n^* = \log N_n + (c_n y_n/pq) + o(1)$

in general.

Since $\{c_n y_n\}$ is a bounded sequence, it follows from (46), (57), (59), and the
proof of Corollary 3, that (58) holds if and only if

(60) $\limsup \{y_n\} < q, \qquad n h(y_n) + \log g(y_n) + (c_n y_n/pq) = o(1).$

If $p > \tfrac{1}{2}$, or if $p \le \rho$, (60) reduces to (48). (60) reduces to (49) if $p = \tfrac{1}{2}$. Suppose
next that $\rho < p < \tfrac{1}{2}$, and that $c_n$ is a constant, say $c_n = c$. In this case,
(60) reduces to conditions 1), 2) and 3) of the paragraph containing (55), but
with $b$ replaced by $(-\log g(\alpha) - c\alpha/pq)/h'(\alpha)$. Since the value of $b$ is immaterial
to the arguments following (55), we conclude that in the present case
(48) suffices for (58) but may not be necessary, unless $\lim y_n$ exists.

It remains therefore to consider the case when $\rho < p < \tfrac{1}{2}$ but the $c_n$ are not
constant. Here (48) is sufficient for (58), but is not necessary, even if $\lim y_n$
exists. Indeed, if $k_n = n(p + \alpha) + b_n$ for each $n$, where $\{b_n\}$ is a bounded
sequence, and we take $c_n = -(pq/\alpha)(b_n h'(\alpha) + \log g(\alpha))$, then (60) and therefore
(58) is satisfied.


REFERENCES

[1] H. S. Wall, Analytic Theory of Continued Fractions, D. Van Nostrand Co. Inc., New
York, 1948.

[2] J. V. Uspensky, Introduction to Mathematical Probability, McGraw Hill Book Co., New
York, 1937.

[3] J. L. Hodges, Jr., and E. L. Lehmann, "The efficiency of some nonparametric
competitors of the t test," Ann. Math. Stat., Vol. 27 (1956), p. 324.

[4] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. I (second
edition), John Wiley and Sons, New York, 1957.

[5] V. V. Petrov, "Generalization of Cramér's limit theorem," Uspekhi Mat. Nauk., Vol.
9 (1954), pp. 195-202 (in Russian).

[6] R. R. Bahadur, "Simultaneous comparison of the optimum and sign tests of a normal
mean," Contributions to Probability and Statistics: Essays in Honor of Harold
Hotelling, Stanford University Press, to be published in 1960.





ON THE MIXTURE OF DISTRIBUTIONS* 


By Henry TEICHER 
Purdue University 


Summary. If $\mathcal{F} = \{F\}$ is a family of distribution functions and $\mu$ is a measure
on a Borel Field of subsets of $\mathcal{F}$ with $\mu(\mathcal{F}) = 1$, then $\int F(\cdot)\, d\mu(F)$ is again a
distribution function which is called a $\mu$-mixture of $\mathcal{F}$. In Section 2, convergence
questions when either $F_n$ or $\mu_k$ (or both) tend to limits are dealt with in the
case where $\mathcal{F}$ is indexed by a finite number of parameters. In Part 3, mixtures
of additively closed families are considered and the class of such $\mu$-mixtures is
shown to be closed under convolution (Theorem 3). In Section 4, sufficient
as well as necessary conditions are given for a $\mu$-mixture of normal distributions
to be normal. In the case of a product-measure mixture, a necessary and sufficient
condition is obtained (Theorem 7). Generation of mixtures is discussed in
Part 5 and the concluding remarks of Section 6 link the problem of mixtures
of Poisson distributions to a moment problem.


1. Introduction. Let $\mathcal{F} = \{F\}$ be a family of one-dimensional¹ cumulative
distribution functions (c.d.f.'s), and let $\mathcal{M} = \{\mu\}$ be a class of measures defined
on $\mathcal{B}$, a Borel Field of subsets of $\mathcal{F}$, with $\mu(\mathcal{F}) = 1$, all $\mu \in \mathcal{M}$. ($\mathcal{B}$ may be taken
to be the smallest sigma-algebra containing sets $A_{x,y} = \{F \mid F(x) \le y, F \in \mathcal{F}\}$.)
Then, [9], [11], $\int_{\mathcal{F}} g(F)\, d\mu(F)$ is defined in the usual manner for measurable
mappings $g$ of $\mathcal{F}$ into the real line. If $g = g_x(F) = F(x)$, this becomes

(1) $H(x) = \int_{\mathcal{F}} F(x)\, d\mu(F).$

The resultant distribution function $H$ will be called a "mixture" or more
specifically a $\mu$-mixture of $\mathcal{F}$ providing the "mixing measure" $\mu$ does not assign
measure one to a particular member of $\mathcal{F}$. Thus, the term mixture², as employed
here, signifies a genuine weighted average of c.d.f.'s.

For a stipulated $\mathcal{F}$, the family $\mathcal{H} = \mathcal{H}(\mathcal{F})$ of mixtures $H$, swept out as $\mu$ varies
over $\mathcal{M}$, will be called the class of $\mathcal{M}$-mixtures of $\mathcal{F}$ or (if $\mathcal{M}$ is definitive in some
sense) simply the class of mixtures of $\mathcal{F}$.

In particular, the family $\mathcal{F}$ may be indexed by a finite number of parameters
$\alpha_1, \alpha_2, \cdots, \alpha_m$, each $\alpha_j$ varying over the real line, that is,

$\mathcal{F} = \{F(x; \alpha_1, \alpha_2, \cdots, \alpha_m)\}.$

Received January 9, 1959; revised October 10, 1959.
* This work was supported in part by an Office of Naval Research Contract.
¹ $\mathcal{F}$ may also be taken to be a family of r-dimensional c.d.f.'s for any positive integer r.
² Actually, (1) corresponds to what Bourbaki [3] calls "integration of measures". However,
except for convergence, the questions considered here have no contact with those of
[3].

Let $\mathcal{G} = \{G(\alpha_1, \alpha_2, \cdots, \alpha_m)\}$ denote the class of m-dimensional c.d.f.'s and
$F(x; \alpha_1, \alpha_2, \cdots, \alpha_m)$ be measurable on (m + 1)-dimensional Euclidean space
$R^{m+1}$. Then, $\mathcal{M}$ may be taken to be the class of Lebesgue-Stieltjes measures
$\{\mu_G\}$ on $R^m$ induced by $G \in \mathcal{G}$ and (1) becomes

(2) $H(x) = \int_{R^m} F(x; \alpha_1, \alpha_2, \cdots, \alpha_m)\, dG(\alpha_1, \alpha_2, \cdots, \alpha_m).$

Similarly, $\mathcal{F}$ may be $\{F(x; \alpha_1, \cdots, \alpha_m)\} = \{F(x; \alpha)\}$ where now
$\alpha = (\alpha_1, \cdots, \alpha_m)$ is restricted to $R_1^m$, some measurable subset of $R^m$. However,
since $\mu_G$ will assign zero measure to $R^m - R_1^m$, one may define $F(x; \alpha)$ to be
an arbitrary c.d.f. for $\alpha \in R^m - R_1^m$. Then (1) again takes the form (2), the
class $\mathcal{G}$ (or $\mathcal{M}$) being suitably restricted.

If $\{F(x; \alpha)\}$ is a discrete family whose discontinuity points are independent
of $\alpha_1, \cdots, \alpha_m$, then the resultant distribution under mixture on $\alpha$ will be
discrete, inheriting the common points of discontinuity. The situation may be
otherwise if the discontinuity points vary with $\alpha$. Thus, if $m = 1$ and $F(x; \alpha)$
has unit saltus at $x = \alpha$, $H(x)$ is continuous if, and only if $G$ is, since $H = G$.
In general, $x_0$ is a discontinuity point of $H(x)$ if and only if the $\alpha$-set for which
$F(x; \alpha)$ is discontinuous at $x_0$ has positive $\mu_G$-measure.

On the other hand, if $F(x; \alpha)$ is absolutely continuous for every $\alpha \in R^m$ then
$f(x; \alpha) = \partial/\partial x\, F(x; \alpha)$ is measurable on $R^{m+1}$, whence, by Fubini's theorem,
the resultant mixture $H(x)$ is absolutely continuous with a density $h(x)$ given
by $\int_{R^m} f(x; \alpha)\, dG(\alpha)$. In other words, for an absolutely continuous family
$\{F(x; \alpha)\}$, $h$ is a $\mu_G$ (or simply $G$)-mixture of $\{f(x; \alpha)\}$.³ Conversely, if a
probability density function (p.d.f.) $h(x)$ is (merely almost everywhere) a $G$-mixture
of a family of p.d.f.'s $\{f(x; \alpha)\}$, with $f(x; \alpha)$ measurable on $R^{m+1}$, its c.d.f. $H(x)$
will be a $G$-mixture of the corresponding family of c.d.f.'s $\{F(x; \alpha)\}$.

It follows directly from Theorem 5 of [13] or the fact that $F(x; \alpha)$ and $G(\alpha)$
determine a joint distribution that if $H$ is a $G$-mixture of $\mathcal{F}$ (of the form (2)),
then its characteristic function $\varphi(t)$ is a $G$-mixture³ of the class $\mathcal{F}^* = \{\varphi(t; \alpha)\}$
of Fourier transforms of the elements of $\mathcal{F}$. Similarly, any existing moment of
$H$ is a $G$-mixture of the family of moments (of the same order) of $\mathcal{F}$ (which
exist except perhaps for a set of $\mu_G$-measure zero). If $G(\alpha_1, \cdots, \alpha_m) = \prod_{i=1}^{m} G_i(\alpha_i)$,
the mixture will be termed a "product measure" mixture. Analogously,
we may speak of a discrete or absolutely continuous mixture according as
$G(\alpha_1, \cdots, \alpha_m)$ (or $\mu$) is a discrete or absolutely continuous c.d.f. (measure).
Finally, a finite (countable) mixture is one for which $\mu$ is discrete and assigns
measure one to a finite (countable) set of points.
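A finite mixture of the form (2) is easy to write down explicitly. The sketch below is an illustration only: it takes a normal family indexed by $\alpha = (\text{mean}, \text{standard deviation})$, a discrete mixing c.d.f. $G$ with three atoms chosen arbitrarily, and checks numerically the remark that the first moment of $H$ is the $G$-mixture of the first moments of the family. The weights, parameter values and function names are all hypothetical.

```python
from math import erf, exp, sqrt, pi

# Hypothetical discrete mixing measure: atoms (mean, sd) with their masses.
ATOMS = [(-1.0, 0.5), (0.0, 1.0), (2.0, 1.5)]
WEIGHTS = [0.2, 0.5, 0.3]


def normal_cdf(x, mean, sd):
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


def normal_pdf(x, mean, sd):
    u = (x - mean) / sd
    return exp(-0.5 * u * u) / (sd * sqrt(2 * pi))


def mixture_cdf(x):
    """H(x) = sum over atoms of (mass) * F(x; atom): the finite case of (2)."""
    return sum(w * normal_cdf(x, m, s) for w, (m, s) in zip(WEIGHTS, ATOMS))


def mixture_pdf(x):
    """Density h(x) as the same G-mixture of the component densities."""
    return sum(w * normal_pdf(x, m, s) for w, (m, s) in zip(WEIGHTS, ATOMS))


def mixture_mean_numeric(lo=-20.0, hi=20.0, steps=40000):
    """Trapezoidal approximation to the integral of x h(x) over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.5 * (lo * mixture_pdf(lo) + hi * mixture_pdf(hi))
    total += sum((lo + i * dx) * mixture_pdf(lo + i * dx) for i in range(1, steps))
    return total * dx


# G-mixture of the component means.
mean_of_means = sum(w * m for w, (m, s) in zip(WEIGHTS, ATOMS))
```

Here `mixture_mean_numeric()` and `mean_of_means` should agree up to quadrature error.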

A question of importance concerning mixtures is that of unique characterization.
That is, for a specific family $\mathcal{F}$, which distributions $H$ uniquely determine
the mixing measure $\mu$? In this connection, we give the following

Definition: A $\mu$-mixture of $\mathcal{F}$, say $H$, will be called "identifiable" if, for any
probability measure $\mu^*$, the relationship $H(x) = \int F(x)\, d\mu(F) = \int F(x)\, d\mu^*(F)$
implies $\mu = \mu^*$. If every member of a class $\mathcal{K}$ of $\mu$-mixtures of $\mathcal{F}$ is identifiable,
$\mathcal{K}$ itself will be called identifiable.

³ Here, we have tacitly extended the terminology "$\mu$-mixture of $\mathcal{F}$" to cases where the
family $\mathcal{F}$ has as elements functions $f(x; \alpha)$ which, for each $\alpha \in R^m$, are not c.d.f.'s in the
remaining variable.

In numerous problems in probability and statistics, one is interested in the 
distribution of a random variable X but knows only the conditional distributions 
of X given the values of some auxiliary random variable Y. Then the desired 
distribution of X is simply a mixture of the known conditional distributions. 
Similarly, one may know the limit of a sequence of distributions of random
variables $X_n$ for all fixed values of other random variables $Y_n$ as well as the
limiting distribution of the $Y_n$ (say $G$), whereas one requires the limiting
distribution of the $X_n$. Under certain conditions, (see especially Theorem 2 and
propositions A, B of Section 2) the latter will be a $G$-mixture of the former.

It is surprising, therefore, that, except for the special case $F(x; \alpha) = F(x - \alpha)$
of convolution, general properties of mixtures have received relatively little
attention. A treatment of mixtures appears in [13] and specific mixture problems
are dealt with in [8] and [14]. The compound Poisson distributions (see e.g. [8]) 
are precisely mixtures of Poisson distributions which are necessarily (by a prior 
remark) discrete distributions with jumps at the non-negative integers. 

Mixtures of distributions are of interest for reasons other than those already 
cited. For example, in the course of determining limit distributions of sums of 
interchangeable random variables [2] mixtures of normal distributions are en- 
countered and this provides one of several motivations for a study of such 
creatures. 


2. Convergence of mixtures. If $G(\alpha)$, $G_k(\alpha)$, $k = 1, 2, \cdots$, is a sequence of
c.d.f.'s such that $G_k(\alpha)$ converges to $G(\alpha)$ on all continuity intervals of $G$
(equivalently, $\lim_{k \to \infty} \int f(\alpha)\, dG_k(\alpha) = \int f(\alpha)\, dG(\alpha)$ for every bounded
continuous function $f(\alpha)$), we write $G_k \Rightarrow G$.

Let $F(x; \alpha)$, $F_n(x; \alpha)$, $n = 1, 2, \cdots$, be a sequence of families of c.d.f.'s
(all functions of $x$, $\alpha$ are supposed measurable on $R^{m+1}$) such that $F_n(x; \alpha) \Rightarrow
F(x; \alpha)$, all $\alpha \in R^m$; similarly, let $G_k(\alpha) \Rightarrow G(\alpha)$ and define $H_{nk}$ ($H$) to be a
$\mu_{G_k}$ ($\mu_G$)-mixture of $\{F_n(x; \alpha)\}$ ($\{F(x; \alpha)\}$), that is,

$H_{nk}(x) = \int F_n(x; \alpha)\, dG_k(\alpha), \qquad H(x) = \int F(x; \alpha)\, dG(\alpha).$

In case $G_k = G$ ($F_n = F$), we write $H_{n\cdot}$ ($H_{\cdot k}$) for $H_{nk}$. As indicated in Section 1,
it is the convergence of the diagonal sequence $H_{nn} \Rightarrow H$ that is of special interest
in probability and statistics. However, it seems pertinent to cite more general
results. (A related but different convergence question is treated in [17a].)

We first consider the simple cases of $H_{n\cdot}$ and $H_{\cdot k}$. It follows from the
dominated convergence theorem and a remark of the preceding section concerning
the relationship between discontinuities of $H$ and those of the family $\mathcal{F}$ that
$H_{n\cdot} \Rightarrow H$. On the other hand, $H_{\cdot k}$ need not converge to $H$ as the following example
shows:

Take $m = 1$ and let $G_k(\alpha)$ be a step function with jumps of $k^{2j} e^{-k^2}/j!$ at the
points $\alpha = -k + j/k$, $j = 0, 1, 2, \cdots$. By a classical result,

$\lim_{k \to \infty} G_k(\alpha) = (1/\sqrt{2\pi}) \int_{-\infty}^{\alpha} e^{-y^2/2}\, dy.$

Choose $\mathcal{F}$ so that $F(x; \alpha) = F_1(x)$ or $F_2(x)$ as $\alpha$ is rational or irrational. Clearly
$H_{\cdot k} = F_1$, while $H = F_2$. Thus if $F_1 \ne F_2$, $H_{\cdot k} \not\Rightarrow H$. The source of trouble
is thereby indicated, leading to

THEOREM 1: If, for each continuity point $x_0$ of $H(x)$,

$\mu_G\{\alpha \mid F(x_0; \alpha) \text{ is discontinuous}\} = 0,$ then $H_{\cdot k} \Rightarrow H$.


PROOF: The theorem is an immediate consequence of an extension of the
Helly-Bray theorem, which, in turn, follows directly from known results. For
example, Theorem 2.1 of [1] (see also [4]) insures that if, for a sequence of
random vectors $X_k$ defined on a probability space,

$G_k(\alpha) = P\{X_k < \alpha\} \Rightarrow G(\alpha) = P\{X < \alpha\},$

then $F_k(a) = P\{h(X_k) < a\} \Rightarrow P\{h(X) < a\} = F(a)$, provided only that the
set of discontinuities of the measurable function $h$ has $\mu_G$-measure zero. But if
$h$ is also bounded, the rth moments of $h(X_k)$ converge to the rth moments of
$h(X)$. For $r = 1$, this shows, for any bounded measurable function $h(\alpha)$ whose
discontinuity set has $\mu_G$-measure zero, that $G_k \Rightarrow G$ implies

$\lim_{k \to \infty} \int h(\alpha)\, dG_k(\alpha) = \lim_{k \to \infty} \int y\, dF_k(y) = \int y\, dF(y) = \int h(\alpha)\, dG(\alpha).$

Applying this result to $h(\alpha) = F(x_0; \alpha)$, the theorem follows.
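The mixing sequence used in the counterexample preceding Theorem 1 can be examined numerically. The sketch below assumes that $G_k$ places mass $k^{2j} e^{-k^2}/j!$ at the point $-k + j/k$, that is, that it is the c.d.f. of a standardized Poisson variable with mean $k^2$, and compares it with the standard normal c.d.f. at a few points. Every atom of $G_k$ is a rational number, which is what drives the counterexample. Function names are illustrative.

```python
from math import lgamma, log, exp, erf, sqrt, floor


def G_k(a, k):
    """C.d.f. with mass k^(2j) e^(-k^2) / j! at -k + j/k, j = 0, 1, 2, ..."""
    lam = k * k
    j_max = floor(k * (a + k))          # largest j with -k + j/k <= a
    if j_max < 0:
        return 0.0
    return sum(exp(j * log(lam) - lam - lgamma(j + 1)) for j in range(j_max + 1))


def std_normal_cdf(a):
    return 0.5 * (1 + erf(a / sqrt(2)))


def max_gap(k, points=(-1.0, 0.0, 1.0)):
    """Largest absolute difference between G_k and the normal c.d.f. on points."""
    return max(abs(G_k(a, k) - std_normal_cdf(a)) for a in points)
```

For moderately large $k$, say `max_gap(20)`, the gap should already be small, even though $G_k$ charges only rational points while the normal limit gives them measure zero.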

In the double sequence case, it follows along the lines of [10, Theorem 26,
p. 284] that, if the total variation $V[G_k - G] \to 0$, then $H_{nk} \Rightarrow H$ ($n, k \to \infty$).
But, as in the example, $G_k \Rightarrow G$ is compatible with $V[G_k - G] = 2$.

Now if $\mathcal{B}$ denotes the class of Borel sets of $R^m$ and $G(\alpha)$, $G_k(\alpha)$ are absolutely
continuous with densities $g(\alpha)$, $g_k(\alpha)$, $k = 1, 2, \cdots$, such that

$g_k(\alpha) \to g(\alpha)$

pointwise, then

$V[G_k - G] = 2 \sup_{B \in \mathcal{B}} |\mu_{G_k}(B) - \mu_G(B)| = 2 \sup_{B \in \mathcal{B}} \left|\int_B g_k(\alpha)\, d\alpha - \int_B g(\alpha)\, d\alpha\right| \to 0$

since [15] $\lim_{k \to \infty} \int_B g_k(\alpha)\, d\alpha = \int_B g(\alpha)\, d\alpha$ uniformly in $B$. Consequently,
A. If $F_n(x; \alpha) \Rightarrow F(x; \alpha)$ and $g_k(\alpha) = G_k'(\alpha) \to G'(\alpha) = g(\alpha)$ for all $\alpha \in R^m$,
then $H_{nk} \Rightarrow H$.
B. Let $G(\alpha)$, $G_k(\alpha)$ be discrete c.d.f.'s with $G_k(\alpha) \Rightarrow G(\alpha)$, $F_n(x; \alpha) \Rightarrow
F(x; \alpha)$ all $\alpha$. If, for every point $\alpha'$ of positive mass of $G$, the mass of $G_k$
at $\alpha'$ converges to that of $G$, then $H_{nk} \Rightarrow H$.
Proposition B follows in the same fashion as A since a slight extension of Scheffé's
theorem [15] yields an analogue for discrete c.d.f.'s under the stated assumption.
We note that the additional proviso is automatically insured by prior assumptions
if the discontinuity points of $G$ have no finite limit point and are not
themselves limit points of discontinuities of $\{G_k\}$.

Next, the rather stringent condition $V[G_k - G] \to 0$ may be replaced as
follows:

THEOREM 2: Let $G_k(\alpha) \Rightarrow G(\alpha)$, and let $R_1^m$ be a measurable set of $R^m$ with
$\mu_G\{R_1^m\} = 1$. In order that $H_{nk} \Rightarrow H$ ($n, k \to \infty$) it is sufficient that for each
continuity point $x_0$ of $H(x)$

(i) $\mu_G\{\alpha \mid F(x_0; \alpha) \text{ is discontinuous}\} = 0$

(ii) $\lim_{n \to \infty} F_n(x_0; \alpha) = F(x_0; \alpha)$ uniformly in $S \cdot R_1^m$ for every closed bounded
$\alpha$-rectangle $S$ of $R^m$.

PROOF: For arbitrary $\epsilon > 0$, choose the "continuity rectangle" $A$ such that
$\mu_G(A) > 1 - \epsilon$ and let $\bar{A}$ denote its closure. Then if $B = R^m - \bar{A} \cdot R_1^m$,

$|H_{nk}(x_0) - H(x_0)| \le \left|\int [F_n(x_0; \alpha) - F(x_0; \alpha)]\, dG_k(\alpha)\right| + \left|\int F(x_0; \alpha)\, dG_k(\alpha) - \int F(x_0; \alpha)\, dG(\alpha)\right|$

$\le \int_{\bar{A} \cdot R_1^m} |F_n(x_0; \alpha) - F(x_0; \alpha)|\, dG_k(\alpha) + 2 \int_B dG_k(\alpha) + \left|\int F(x_0; \alpha)\, dG_k(\alpha) - \int F(x_0; \alpha)\, dG(\alpha)\right| \le \epsilon + 4\epsilon + \epsilon = 6\epsilon$

for sufficiently large $n$ and $k$ by Theorem 1 and (ii).


3. Mixtures of additively closed families. We recall [18] that a family $\mathcal{F} =
\{F(x; \alpha)\} = \{F(x; \alpha_1, \cdots, \alpha_m)\}$ where $\alpha_j$ varies over an additive abelian
semi-group $D_j$, $j = 1, 2, \cdots, m$, is called "additively closed" if for every
admissible $\alpha$, $\beta$,

(3) $F(x; \alpha) * F(x; \beta) = F(x; \alpha + \beta)$

where, as usual, $*$ denotes the convolution operation. The families of normal,
Poisson, binomial and many other distributions are encompassed within this
definition.

Suppose that $D_j$ denotes either $R^1$ or some measurable subset of $R^1$ that is
an additive Abelian semi-group, $j = 1, 2, \cdots, m$, and that $\mu_G$ assigns measure
one to $D = D_1 \times D_2 \times \cdots \times D_m$. Then, under (3), $\int_D F(x; \alpha)\, dG(\alpha)$ is a
mixture of the additively closed family $\mathcal{F} = \{F(x; \alpha)\}$.

THEOREM 3: Let $H_i$ be a $G_i$-mixture of the additively closed family $\mathcal{F}$, $i = 1, 2$.
Then the convolution $H_1 * H_2$ is a $(G_1 * G_2)$-mixture of $\mathcal{F}$. Conversely, if for some
$r \ge 1$ and all $G_1$, $G_2$ having exactly $r$ points of positive mass, the convolution of a
$G_1$-mixture of $\mathcal{F}$ with a $G_2$-mixture of $\mathcal{F}$ is a $(G_1 * G_2)$-mixture of $\mathcal{F}$, then $\mathcal{F}$ is
additively closed.

PROOF: Let $H = H_1 * H_2$, $G = G_1 * G_2$ and denote by $\varphi(t)$, $\varphi_1(t)$, $\varphi_2(t)$ and
$\varphi(t; \alpha)$ the characteristic functions (c.f.'s) respectively of $H$, $H_1$, $H_2$ and $F(x; \alpha)$.
Since, as remarked earlier, $\varphi_i(t)$ is a $G_i$-mixture of $\{\varphi(t; \alpha)\}$, $i = 1, 2$, we have

$\varphi(t) = \varphi_1(t)\varphi_2(t) = \int_D \varphi(t; \alpha)\, dG_1(\alpha) \int_D \varphi(t; \beta)\, dG_2(\beta)$

$= \int_D \int_D \varphi(t; \alpha + \beta)\, dG_1(\alpha)\, dG_2(\beta)$

$= \int_D \int_D \varphi(t; \gamma)\, dG_1(\gamma - \beta)\, dG_2(\beta)$

$= \int_D \varphi(t; \gamma)\, dG(\gamma),$

employing (3) and Theorem 5 of [13]. In view of the one-to-one correspondence
between c.d.f.'s and c.f.'s, this implies that $H$ is a $G$-mixture of $\mathcal{F} = \{F(x; \alpha)\}$.

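In proving the converse, we suppose $r = 2$ (and the distributions discrete)
for brevity's sake. By hypothesis,

$\int_{R^m} \varphi(t; \alpha)\, dG_1(\alpha) \int_{R^m} \varphi(t; \beta)\, dG_2(\beta) = \int_{R^m} \varphi(t; \gamma)\, dG(\gamma),$

where we may choose

$\mu_{G_1}(\alpha_0) = 1 - \mu_{G_1}(\alpha_1) = \mu_{G_2}(\beta_0) = 1 - \mu_{G_2}(\beta_1) = 1 - \frac{1}{n}.$

Since $G = G_1 * G_2$, $\varphi(t; \alpha_i + \beta_j)$, $i, j = 0, 1$, belongs to the class of transforms
of $\mathcal{F}$, i.e., the domain $D$ of $\alpha$ is an Abelian semi-group and

$\left[\frac{n-1}{n} \varphi(t; \alpha_0) + \frac{1}{n} \varphi(t; \alpha_1)\right] \left[\frac{n-1}{n} \varphi(t; \beta_0) + \frac{1}{n} \varphi(t; \beta_1)\right]$

$= \left(\frac{n-1}{n}\right)^2 \varphi(t; \alpha_0 + \beta_0) + \frac{n-1}{n^2} [\varphi(t; \alpha_0 + \beta_1) + \varphi(t; \alpha_1 + \beta_0)] + \frac{1}{n^2} \varphi(t; \alpha_1 + \beta_1).$

Letting $n \to \infty$, we see that $\varphi(t; \alpha_0) \cdot \varphi(t; \beta_0) = \varphi(t; \alpha_0 + \beta_0)$. The conclusion
now follows from the fact that $\alpha_0$ and $\beta_0$ are arbitrary points of $R^m$ (or some
measurable sub-region $D$ thereof).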

COROLLARY 1: An infinitely divisible mixing ($G$) of an additively closed family
($\mathcal{F}$) yields an infinitely divisible mixture ($H$).

COROLLARY 2: The convolution of two compound Poisson distributions (see
Section 1) is again a compound Poisson distribution whose mixing c.d.f. is the
convolution of the two given mixing c.d.f.'s.

This was proved by Feller [8] and follows from Theorem 3 by taking $m = 1$,
$D_1 = [0, \infty)$, $\varphi(t; \alpha) = \exp\{\alpha(e^{it} - 1)\}$. For this same choice of $\varphi(t; \alpha)$,
Corollary 1 appears in [11a]. Similarly, for $m = 1$,
$D_1 = [0, \infty)$, $\varphi(t; \alpha) = \exp\{-\alpha |t|^{\beta}\}$, $0 < \beta \le 2$,
we have

COROLLARY 3: The convolution of two mixtures of symmetric stable distributions
of fixed exponent $\beta$ is again a mixture of the same type with mixing c.d.f. the
convolution of the given mixing c.d.f.'s.

Let $m = 2$, $D_1 = R^1$,

$D_2 = (0, \infty)$, $\varphi(t; \alpha_1, \alpha_2) = \varphi(t; \theta, \sigma^2) = \exp\{i\theta t - \sigma^2 t^2/2\}.$

By an extension of terminology, a mixture of normal distributions (on both
parameters) might be called compound normal. Then Corollary 2 remains valid
if everywhere therein the word "Poisson" is replaced by the word "normal".
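Corollary 2 can be checked directly for finite mixtures. The following sketch takes two discrete mixing measures on the Poisson parameter (the atoms and masses are hypothetical), forms the two Poisson mixtures, convolves their probability functions term by term, and compares the result with the Poisson mixture whose mixing measure is the convolution of the two given ones. All names are illustrative.

```python
from math import exp, lgamma, log


def poisson_pmf(j, lam):
    """Poisson probability at j for parameter lam > 0."""
    return exp(j * log(lam) - lam - lgamma(j + 1))


def mixture_pmf(mixing, j):
    """Finite Poisson mixture: mixing maps each parameter value to its mass."""
    return sum(w * poisson_pmf(j, lam) for lam, w in mixing.items())


def convolve_mixing(g1, g2):
    """Convolution of two discrete mixing measures (sum of independent atoms)."""
    out = {}
    for a, wa in g1.items():
        for b, wb in g2.items():
            out[a + b] = out.get(a + b, 0.0) + wa * wb
    return out


def convolved_pmf(g1, g2, j):
    """Probability at j of the convolution H_1 * H_2 of the two mixtures."""
    return sum(mixture_pmf(g1, i) * mixture_pmf(g2, j - i) for i in range(j + 1))


# Hypothetical mixing measures, each with two atoms.
G1 = {1.0: 0.4, 2.5: 0.6}
G2 = {0.5: 0.7, 3.0: 0.3}
G = convolve_mixing(G1, G2)
```

For each `j`, `convolved_pmf(G1, G2, j)` and `mixture_pmf(G, j)` should agree up to rounding, which is the discrete form of the identity established in the proof of Theorem 3.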

In [18], it was shown that, except for a pathological case (arising when $\alpha$
varies in a continuum and which may be excluded by a slight additional
assumption), if $\mathcal{F} = \{F(x; \alpha)\}$ is additively closed, then $\varphi(t; \alpha)$ is of the form

$\prod_{j=1}^{m} [\varphi_j(t)]^{\alpha_j};$

specifically, for $m = 1$,

(4) $\varphi(t; \alpha) = [\varphi(t)]^{\alpha}, \qquad \alpha \ge 0,$

where $\varphi(t)$ is a c.f. independent of $\alpha$. In order to avoid detailing the conditions,
let us say that $\mathcal{F} \in C_1$ if (4) holds. Most of the classical one-parameter families
of distributions belong to $C_1$.

We pose the question whether a $G$-mixture of $\mathcal{F}$, with $\mathcal{F} \in C_1$, may itself be
an element of $\mathcal{F}$. If $\mathcal{F}$ is the additively closed family of unitary distributions,
i.e. $\varphi(t; \alpha) = [e^{it}]^{\alpha}$, all real $\alpha$, then the very definitions of c.f. and mixture show
that any non degenerate c.d.f., $H$, is a mixture (in fact an $H$-mixture) of $\mathcal{F}$ and
hence not an element of $\mathcal{F}$. The following theorem shows this situation to prevail
under considerably less trivial circumstances.

THEOREM 4: Take $m = 1$ and let $\mathcal{F} = \{F(x; \alpha)\} \in C_1$. If

(i) $\varphi(t; \alpha)$ is real-valued (for real $t$) and $\lim_{t \to t_0} \varphi(t) = 0$ for $t_0$ finite or
infinite or

(ii) $\{F(x; \alpha)\}$ has finite second moments and non-zero first moments then no
$G(\alpha)$-mixture of $\mathcal{F}$ belongs to $\mathcal{F}$.

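Proof: If a $G$-mixture of $\mathcal{F}$ is an element of $\mathcal{F}$, say with c.f. $\psi(t) = [\varphi(t)]^{\gamma}$,
$\gamma \ge 0$, we have

(4.1)   $\int [\varphi(t)]^{\alpha}\, dG(\alpha) = [\varphi(t)]^{\gamma}$

or

(4.2)   $1 = \int_{\alpha \le \gamma} [\varphi(t)]^{\alpha-\gamma}\, dG(\alpha) + \int_{\alpha > \gamma} [\varphi(t)]^{\alpha-\gamma}\, dG(\alpha)$.
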
In case (i), if $G(\gamma+) = 0$, letting $t \to t_0$ in (4.2), we reach the contradiction
$1 = 0$. Since $G(\gamma+) - G(\gamma-) = 1$ is precluded, $G(\gamma-) = p > 0$ and we
may choose $\epsilon$, $0 < \epsilon < \gamma$, such that $G(\gamma - \epsilon) > 0$. Also, if $t_0$ is finite, we may
take it to be the smallest (positive) zero of $\varphi(t)$, in which case $\varphi(t) > 0$ for
$|t| < t_0$ whether $t_0$ is finite or infinite. Thus, for $|t| < t_0$, from (4.2)

        $1 \ge \int_{\alpha \le \gamma-\epsilon} [\varphi(t)]^{\alpha-\gamma}\, dG(\alpha) \ge [\varphi(t)]^{-\epsilon}\, G(\gamma - \epsilon)$,

which is clearly impossible for $t$ sufficiently close to $t_0$.
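In case (ii), since the c.d.f. with c.f. $\psi(t)$ has finite second moment and the first
and second moments of $F(x; \alpha)$ are $-i\alpha\varphi'(0)$ and

        $-\{\alpha^2[\varphi'(0)]^2 + \alpha[\varphi''(0) - (\varphi'(0))^2]\}$,

it follows that $G$ has its first two moments finite, whence differentiation under
the integral sign in (4.1) is permissible. Thus, for $\varphi'(t) \ne 0$,

        $[\psi'(t)]/[\varphi'(t)] = \int \alpha[\varphi(t)]^{\alpha-1}\, dG(\alpha)$

and for $\varphi(t)\cdot\varphi'(t) \ne 0$,

(4.3)   $\frac{\varphi'\psi'' - \psi'\varphi''}{(\varphi')^3} + \frac{\psi'}{\varphi\varphi'} - \left(\frac{\psi'}{\varphi'}\right)^2 = \int \alpha^2[\varphi(t)]^{\alpha-2}\, dG(\alpha) - \left(\int \alpha[\varphi(t)]^{\alpha-1}\, dG(\alpha)\right)^2$.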


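When $\psi = \varphi^{\gamma}$, (4.3) becomes

(4.4)   $\gamma^2\varphi^{\gamma-2}(t) - [\gamma\varphi^{\gamma-1}(t)]^2 = \int \alpha^2[\varphi(t)]^{\alpha-2}\, dG(\alpha) - \left(\int \alpha[\varphi(t)]^{\alpha-1}\, dG(\alpha)\right)^2$

for all $t$ such that $\varphi(t)\cdot\varphi'(t) \ne 0$.
Since $\varphi'(0) \ne 0$, we may substitute $t = 0$ directly in (4.4), obtaining

        $0 = \int \left\{\alpha - \int \alpha\, dG(\alpha)\right\}^2 dG(\alpha)$,

which implies that $G$ is a unitary distribution.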

In the case $\varphi'(0) = 0$ but $\int \alpha^2\, dG(\alpha) < \infty$, there exists an interval $[0, \epsilon)$
in which $\varphi'(t) \ne 0$ since the contrary would imply $\varphi(t) = 1$ in $[0, \epsilon)$, hence
$\varphi(t) \equiv 1 \equiv \varphi(t; \alpha)$ and $\mathcal{F}$ degenerate. Consequently, (4.4) holds for a sequence
of $t$-values approaching zero and, by continuity, at zero also, again yielding the
prior contradiction.

Note that Theorem 4 does not preclude $\int [\varphi(t)]^{\alpha}\, dG(\alpha) = \varphi(at)$ for some
real $a$; here $G(\alpha) = G_a(\alpha)$. Taking $\psi(t) = \varphi(at)$ in (4.3), we see that under
(ii), $a \ge 1$ (with equality rendering $G$ degenerate). Example 1 of Section 5
illustrates this possibility.

Corollary: No mixture of symmetric stable distributions with fixed exponent $\beta$
($0 < \beta \le 2$) is a symmetric stable distribution with exponent $\beta$.

On the other hand, Wintner [20] has shown that any symmetric stable
distribution of exponent $\beta$ ($0 < \beta < 2$) is a "mixture" of symmetric stable
distributions of some fixed larger exponent $\gamma$ (with a non-finite "mixing measure").

When $m = 2$, there is much greater latitude for a $G$-mixture of $\{F(x; \alpha)\}$
since in the representation of $\varphi(t; \alpha)$, $f_1(t)$ and $f_2(t)$ need not both be c.f.'s and
even when they are, $\alpha_1$ or $\alpha_2$ may assume both positive and negative values.
The next section deals with this case when $\mathcal{F}$ is the two-parameter family of
normal distributions.


4. Mixtures of normal distributions. On occasion, the underlying population 
(distribution) of interest to the statistician is not prescribed to be normal but 
rather is generated by selecting one of a collection of alternative normal distri- 
butions according to some probability mechanism or scheme. If the resulting 
mixture of normal distributions is itself normal, many classical results may be 
utilized. 

Consider, therefore, mixtures of the two-parameter family of normal
distributions and under what circumstances, i.e. for what measures $\mu$, such mixtures
may themselves be normal.

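Define $\Phi(x) = (1/\sqrt{2\pi}) \int_{-\infty}^{x} e^{-y^2/2}\, dy$ and $\Phi(x; \theta, \sigma^2) = \Phi((x - \theta)/\sigma)$, where
$\theta \in R^1$ and $\sigma \in (0, \infty)$. The question arises whether "degenerate normal
distributions" (viz., $\sigma = 0$, which is interpreted as $\Phi(x; \theta, 0) = 0$, $x \le \theta$, and
$\Phi(x; \theta, 0) = 1$, $x > \theta$) should be mixed; these will be banned because if $\mu$
assigns measure one to $\{\sigma^2 = 0\}$ an arbitrary distribution may be thereby obtained.

If $G(\theta, \sigma^2)$ is a c.d.f. which is zero on the lower half and boundary ($\sigma^2 = 0$)
of the $(\theta, \sigma^2)$ plane and $\mu(\theta, \sigma^2) = \mu_G$ is the corresponding measure on the Borel
sets of $R^2$, let

(5)     $H(x) = \int \Phi(x; \theta, \sigma^2)\, dG(\theta, \sigma^2) = \int \Phi\left(\frac{x - \theta}{\sigma}\right) d\mu$.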


The class $\mathcal{H}$ of mixtures (5) of normal distributions is by no means
identifiable (see Section 1 for definition); this will be apparent momentarily if it is not
already so. On the other hand, the class $\mathcal{H}_m$ (respectively, $\mathcal{H}_v$) of mixtures on
means only (respectively, on variances only) is identifiable.

For $\mathcal{H}_m$ may also be characterized as the class of c.d.f.'s containing a fixed
normal factor with mean zero and specified variance, say unity. Since the normal
c.f. is non-vanishing, $\Phi * G_1 = \Phi * G_2$ implies $G_1 = G_2$, which is therefore
tantamount to the identifiability of $\mathcal{H}_m$. (Of course, $\Phi(x; 0, \sigma_1^2) * G_1(x) =
\Phi(x; 0, \sigma_2^2) * G_2(x)$ has solutions $G_1 \ne G_2$ if $\sigma_1 \ne \sigma_2$, but this is not the issue.)

If $H \in \mathcal{H}_v$, we may suppose without loss of generality that its mean is zero,
so that its c.f. is $\int_0^{\infty} \exp\{-t^2\sigma^2/2\}\, dG(\sigma^2)$, whence the identifiability of $\mathcal{H}_v$ is
an immediate consequence of the uniqueness theorem for Laplace transforms.

In returning to a consideration of $\mathcal{H}$, we note that by integrating (5) over the
regions $\theta < x$ and $\theta \ge x$, $\mu\{\theta < x\} < 2H(x)$, $\mu\{\theta \ge x\} \le 2[1 - H(x)]$; that
is, the "tails" of the distribution of means ($\theta$) are dominated by those of $H$.^4

Since $\theta$ and $\sigma^2$ are Euclidean "random variables", a conditional distribution
of $\theta$ given $\sigma^2$ exists. Denote it by $G_{\sigma^2}(\theta)$ and let $\varphi_{\sigma^2}(t)$ be the corresponding c.f.;
also, define $G_1(\sigma'^2) = \mu\{\sigma^2 \mid \sigma^2 < \sigma'^2\}$. Then (5) may be rewritten as

(6)     $H(x) = \int\!\!\int \Phi\left(\frac{x - \theta}{\sigma}\right) dG_{\sigma^2}(\theta)\, dG_1(\sigma^2) = \int [\Phi(x; 0, \sigma^2) * G_{\sigma^2}(x)]\, dG_1(\sigma^2)$.

^4 If the distribution of means is degenerate, equality may hold in the second
relationship for some $x$.



Since $H$ is a $G_1$-mixture of the bracketed family of c.d.f.'s, its c.f., say $\varphi(t)$, is
given by

(7)     $\varphi(t) = \int e^{-\sigma^2 t^2/2}\, \varphi_{\sigma^2}(t)\, dG_1(\sigma^2)$.


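Equating the real parts of (7), we note that, if $H$ is a symmetric c.d.f., i.e.
$\varphi(t)$ is real-valued (e.g. normal), and a $\mu$-mixture of $\{\Phi(x; \theta, \sigma^2)\}$, it is also a
mixture of $\{\Phi(x; \theta, \sigma^2)\}$ with respect to a measure for which the c.d.f.'s
$G_{\sigma^2}(\theta)$ are symmetric.
We turn directly to the case $H(x) = \Phi(x; \theta_0, \sigma_0^2)$ for fixed $\theta_0$, $\sigma_0^2 > 0$. The
fact that $\sigma^2$ then has a bounded spectrum (see Theorem 6), together with the
domination of the tails of the $\theta$ distribution by those of the normal (see the
paragraph preceding (6)), ensures that $G(\theta, \sigma^2)$ has finite moments of all orders.
If $\Phi(x; \theta_0, \sigma_0^2)$ is a $G$-mixture of $\{\Phi(x; \theta, \sigma^2)\}$, then, replacing $x$ by $\sigma_0 x + \theta_0$,

(8)     $\Phi(x) = \int \Phi\left(\frac{\sigma_0 x + \theta_0 - \theta}{\sigma}\right) dG(\theta, \sigma^2) = \int \Phi\left(\frac{x - (\theta - \theta_0)/\sigma_0}{\sigma/\sigma_0}\right) dG(\theta, \sigma^2)$,

and we therefore suppose without loss of generality that $\theta_0 = 0$, $\sigma_0 = 1$. Taking
(bilateral) Laplace transforms in (8), we have

(9)     $e^{s^2/2} = \int_{R^2} e^{(\sigma^2 s^2/2) - \theta s}\, dG(\theta, \sigma^2)$.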


Multiplying (9) by $\exp\{\beta s^2/2\}$, replacing $s$ by $s(1 + \beta)^{-1/2}$ and changing
integration variables shows that, if $\Phi(x)$ is a $G$-mixture of $\{\Phi(x; \theta, \sigma^2)\}$, it is likewise a
$G_{\beta}$-mixture of $\{\Phi(x; \theta, \sigma^2)\}$, where $G_{\beta}(\theta, \sigma^2) = G(\theta\sqrt{1 + \beta},\ \sigma^2(1 + \beta) - \beta)$,
$\beta \ge 0$. (Note that since the mass of $G$ is contained in the strip $0 < \sigma^2 < 1$
(Theorem 6), the mass of $G_{\beta}$ is constrained to lie in the strip $\beta/(1 + \beta) \le
\sigma^2 < 1$.) Thus, contrary to the Compound Poisson case, specification of $H$ by
no means determines the mixing measure $\mu$.

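Now the representation (6), with $H(x) = \Phi(x)$, elicits the obvious
"solutions" $G_{\sigma^2}(\theta) = \Phi(\theta; 0, 1 - \sigma^2)$, $G_1(\sigma^2)$ = arbitrary c.d.f. on (0, 1). On the other
hand, the following easily proved

Lemma: $\Psi(x) = (1 + d)\Phi(x; \theta_1, \sigma_1^2) - d\,\Phi(x; \theta_2, \sigma_2^2)$, $d > 0$, is a c.d.f. if
and only if $\sigma_2 < \sigma_1$, and

        $\frac{d}{1 + d} \cdot \frac{\sigma_1}{\sigma_2} \exp\left\{\frac{(\theta_1 - \theta_2)^2}{2(\sigma_1^2 - \sigma_2^2)}\right\} \le 1$,

shows that

        $G_1(\sigma^2) = 0$ for $\sigma^2 \le \sigma_1^2$;  $= (1 + d)^{-1}$ for $\sigma_1^2 < \sigma^2 \le \sigma_2^2$;  $= 1$ for $\sigma^2 > \sigma_2^2$,

        $G_{\sigma_1^2}(\theta) = (1 + d)\Phi(\theta; 0, 1 - \sigma_1^2) - d\,\Phi(\theta; \theta_2, \sigma_2^2 - \sigma_1^2)$,

        $G_{\sigma_2^2}(\theta) = 0$ for $\theta \le \theta_2$;  $= 1$ for $\theta > \theta_2$,

also constitute solutions when $d$, $\theta_2$, $\sigma_1$, $\sigma_2$ are as prescribed, that is, when
the Lemma's condition holds for the pair $(0, 1 - \sigma_1^2)$, $(\theta_2, \sigma_2^2 - \sigma_1^2)$. We have thus
proved (with the exception of the parenthetical statement which follows from
Theorem 6)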

Theorem 5: Suppose (as the conclusion requires) that $\mu\{\sigma^2 \mid \sigma^2 > \sigma_0^2\} = 0$.
Then a sufficient but unnecessary condition that a $\mu$-mixture of normal distributions
be normal with mean $\theta_0$ and variance $\sigma_0^2$ is that the conditional distribution of $\theta$
given $\sigma^2$ be normal with mean $\theta_0$ and variance $\sigma_0^2 - \sigma^2$ for all values of $\sigma^2$ for which
it is defined.
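
The Lemma and the two-point solution preceding Theorem 5 lend themselves to a quick numerical check. The Python sketch below is illustrative only: the function names, the grid and the parameter values are assumptions made for the example and do not come from the paper. It evaluates the Lemma's inequality, inspects the sign of the corresponding signed density on a grid, and confirms that the two-point mixture on variances reproduces $\Phi(x)$.

```python
import math

def norm_pdf(x, mean, var):
    """Density of the normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def norm_cdf(x, mean=0.0, var=1.0):
    """C.d.f. Phi(x; mean, var) expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def lemma_holds(d, theta1, var1, theta2, var2):
    """Lemma's criterion for (1+d)*Phi(.;theta1,var1) - d*Phi(.;theta2,var2)."""
    if d <= 0.0 or var2 >= var1:
        return False
    bound = (d / (1.0 + d)) * math.sqrt(var1 / var2) * math.exp(
        (theta1 - theta2) ** 2 / (2.0 * (var1 - var2)))
    return bound <= 1.0

def min_signed_density(d, theta1, var1, theta2, var2, lo=-8.0, hi=8.0, steps=1600):
    """Smallest grid value of the derivative of the signed combination."""
    xs = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return min((1.0 + d) * norm_pdf(x, theta1, var1) - d * norm_pdf(x, theta2, var2)
               for x in xs)

def two_point_mixture_cdf(x, d, theta2, s1, s2):
    """Mixture with G_1 putting mass 1/(1+d) on variance s1 and d/(1+d) on s2.

    The convolutions with Phi(.; 0, s1) and Phi(.; 0, s2) are done in closed
    form: convolving normals adds means and variances.
    """
    w1, w2 = 1.0 / (1.0 + d), d / (1.0 + d)
    part1 = (1.0 + d) * norm_cdf(x, 0.0, 1.0) - d * norm_cdf(x, theta2, s2)
    part2 = norm_cdf(x, theta2, s2)
    return w1 * part1 + w2 * part2

# Hypothetical parameter choice: s1 = 0.25, s2 = 0.5, theta2 = 0.3.
S1, S2, THETA2 = 0.25, 0.5, 0.3
ok = lemma_holds(0.5, 0.0, 1.0 - S1, THETA2, S2 - S1)
bad = lemma_holds(5.0, 0.0, 1.0 - S1, THETA2, S2 - S1)
```

With these values the criterion holds for d = 0.5 and fails for d = 5, and the grid minimum of the signed density changes sign accordingly, which is what the Lemma predicts.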

Naturally, (8) imposes constraints on the distributions of means $\theta$ and
variances $\sigma^2$ and we now proceed to establish some of these.

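Theorem 6: In order that a $\mu$-mixture of normal distributions be normal with
mean $\theta_0 = 0$ and variance $\sigma_0^2 = 1$, it is necessary that

(i) $\mu\{\sigma^2 \mid \sigma^2 > 1\} = 0 = \mu\{\theta, \sigma^2 \mid \sigma^2 = 1, \theta \ne 0\}$. Hence it may be supposed
that $\mu\{\sigma^2 \mid \sigma^2 = 1\} = 0$.

(ii)^5 $\mu\left\{\theta, \sigma^2 \mid \frac{\theta}{1 - \sigma^2} > C\right\} \cdot \mu\left\{\theta, \sigma^2 \mid \frac{\theta}{1 - \sigma^2} < -C\right\} > 0$, all $C > 0$,

(iii) $\int \exp\left\{\frac{\theta^2}{2(1 - \sigma^2)}\right\} d\mu = \infty$,

(iv) $\mu\{\theta \mid |\theta| < e^{-1/2}\} \cdot \mu\{\theta, \sigma^2 \mid \theta^2 \le \sigma^2 \log_e (1/\sigma^2)\} > 0$,

(v) the $\theta$-spectrum of $\mu$ not be confined to a subset of numbers in arithmetic
progression; further, for all integers $m$ (all real $b$) and all integers $n \ge 1$,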
("Pay i+ 37) 
uia|yve U E =e vas 


j= 


8n ” 8n 
where $\bar{y}$ signifies the fractional part of $y$ and either $y = \theta - m/n$ or $y = b\theta$.
Proof: Rewrite (9) as

(6.1)   $1 = \int_{R^2} e^{(\sigma^2 - 1)s^2/2 - \theta s}\, d\mu$.

Suppose now that for some $\epsilon > 0$, $B_{\epsilon} = \{\sigma^2 \mid \sigma^2 \ge 1 + \epsilon\}$ has positive $\mu$-measure.
Then for sufficiently large $C$, so does $B_{\epsilon, C} = B_{\epsilon} \cdot \{\theta \mid |\theta| \le C\}$, whence, for
$s$ real, (6.1) implies

        $1 \ge \int_{B_{\epsilon, C}} e^{(\sigma^2 - 1)s^2/2 - \theta s}\, d\mu \ge e^{\epsilon s^2/2 - C|s|}\, \mu\{B_{\epsilon, C}\}$.

For sufficiently large $s$, this is manifestly impossible. Thus $\mu\{B_{\epsilon}\} = 0$, all $\epsilon > 0$,
which implies the first equality in (i); the second follows in similar fashion from
(6.1). Further, if $\mu$ assigns measure $p_0 > 0$ to the point $\theta = 0$, $\sigma = 1$,
subtracting $p_0\Phi(x)$ from both sides of (8) and dividing by $1 - p_0$, a new related
mixture $\mu^*$ is obtained for which $p_0 = 0$. Generality is clearly maintained in
supposing $\mu = \mu^*$.

^5 The writer cordially thanks his colleague Prof. Michael Golomb for helpful
conversations relating to an early version of (ii).

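To prove (ii), let $W = \{\theta, \sigma^2 \mid 0 < \sigma^2 < 1\}$ and observe from (6.1) and (i)
that for all real $s$

(6.2)   $1 = \int_W e^{-(1 - \sigma^2)s^2/2 - \theta s}\, d\mu = \int_W \exp\left\{-\frac{(1 - \sigma^2)}{2}\left[s + \frac{\theta}{1 - \sigma^2}\right]^2\right\} d\mu_1$,

where $d\mu_1 = \exp\{\theta^2/2(1 - \sigma^2)\}\, d\mu$. If, now, for some $C > 0$,

        $\mu\{\theta, \sigma^2 \mid \theta/(1 - \sigma^2) > C\} = 0$,

then $s < s_0 < -C$ would imply

        $\int_W \exp\left\{-\frac{(1 - \sigma^2)}{2}\left[s + \frac{\theta}{1 - \sigma^2}\right]^2\right\} d\mu_1 < \int_W \exp\left\{-\frac{(1 - \sigma^2)}{2}\left[s_0 + \frac{\theta}{1 - \sigma^2}\right]^2\right\} d\mu_1$

in violation of (6.2). The remaining part of (ii) is analogous.

If (iii) did not obtain, it would be legitimate to let $s \to \infty$ within the second
integral of (6.2) and conclude that $1 = 0$.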

As remarked in a more general context in Section 1, (8) implies a
corresponding relationship for densities (here multiplied by $\sqrt{2\pi}$), namely

(6.3)   $e^{-x^2/2} = \int_{R^2} \frac{1}{\sigma}\, e^{-(x - \theta)^2/2\sigma^2}\, d\mu$

which, evaluated at $x = 0$, becomes

(6.4)   $1 = \int_{R^2} \frac{1}{\sigma}\, e^{-\theta^2/2\sigma^2}\, d\mu$.

The integrand of (6.4) cannot be less than one on a set of $\mu$-measure one, which
is equivalent to $\mu\{\theta, \sigma^2 \mid \theta^2 \le \sigma^2 \log_e (1/\sigma^2)\} > 0$. The remaining portion of (iv)
follows by noting that the first part implies $\mu\{\theta \mid |\theta| \le e^{-1/2}\} > 0$, and
consequently $\mu\{\theta \mid |\theta| < e^{-1/2}\} > 0$, since the negation of the latter would entail
$\mu\{\theta, \sigma^2 \mid |\theta| = e^{-1/2}, \sigma^2 = e^{-1}\} = 1$, which is easily seen to be incompatible with
(6.3). (iv) also follows from $\int \{\Phi[(x + \theta)/\sigma] + \Phi[(x - \theta)/\sigma] - 2\Phi(x)\}\, d\mu = 0$.
Next, set $s = -it$ in (9), obtaining

(6.5)   $e^{-t^2/2} = \int_{R^2} e^{-\sigma^2 t^2/2 + it\theta}\, dG(\theta, \sigma^2)$.

If, in violation of the first statement of (v), the $\theta$ spectrum is concentrated at
points $a + kb$ where $a$, $b \ne 0$ are real ($b = 0$ requires $a = 0$; this case is ruled
out by the corollary to Theorem 7) and $k$ varies over some subset of the integers,
take $t = 2\pi n/b$ in (6.5), obtaining

(6.6)   $1 = \int e^{2\pi^2 n^2 (1 - \sigma^2)/b^2} \cos (2\pi n a/b)\, d\mu$

for all integral values of $n$. If $a/b$ is rational, say $a/b = m_1/m_2$ where $m_1$ and $m_2$
are relatively prime integers, (6.6) is contradicted by choosing $n = m_2$ and
$n = 2m_2$. On the other hand, if $a/b$ is irrational, $n$ may be selected such that
the fractional part of $na/b$ lies in $(\frac{1}{4}, \frac{3}{4})$, thereby rendering the integrand of (6.6)
negative.

To demonstrate the second part of (v), take $t = 2\pi n$ in (6.5), obtaining

        $1 = \int \exp\{2\pi^2 n^2 (1 - \sigma^2) + 2n\pi i(\theta - m/n)\}\, d\mu$

        $\ \ = \int \exp\{2\pi^2 n^2 (1 - \sigma^2) + \pi i(2nj + 2k + \epsilon)\}\, d\mu$,

where $j$, $k$, $n$ are integers and $\epsilon = \epsilon(\theta)$ lies in $[\frac{1}{2}, \frac{3}{2}]$. Since the real part of the
integrand is non-positive on a set of measure one, there is a gross contradiction.
If $y = b\theta$, set $t = 2\pi nb$, after which the argument is the same. Q.E.D.

For any c.d.f. $G$ satisfying (8), let $\varphi_G(t, u)$ denote the corresponding c.f. In
view of the domination of the tails of the $\theta$ distribution by those of $\Phi(x)$, $\varphi_G(z, 0)$
is defined and convergent for all complex $z$. Also, since $\sigma^2$ has a bounded
spectrum, $\varphi_G(0, w)$ is an entire function of $w$. But then (see e.g. Theorem 2 of
[19]) $\varphi_G(z, w)$ is jointly analytic in $z$ and $w$. Thus, from (6.5) we see that
$\varphi_G(z, iz^2/2) = e^{-z^2/2}$ for all complex $z$, but this is insufficient to characterize
$\varphi_G(z, w)$.

In the case of product measure, the class of measures $\mu$ satisfying (5) is given
by

Theorem 7: A product-measure mixture of normal distributions is normal with
mean $\theta_0$ and variance $\sigma_0^2$ if and only if for some $\sigma_1^2$ in $(0, \sigma_0^2)$, $G_{\sigma^2}(\theta) =
\Phi(\theta; \theta_0, \sigma_0^2 - \sigma_1^2)$ and $G_1(\sigma^2)$ is degenerate at $\sigma_1^2$.

Proof: Sufficiency is obvious and subsumed in Theorem 5. To prove necessity
note that, since $G_{\sigma^2}(\theta)$ is constant with respect to $\sigma^2$, (6) simplifies to

        $\Phi(x; \theta_0, \sigma_0^2) = G_{\sigma^2}(x) * \int \Phi(x; 0, \sigma^2)\, dG_1(\sigma^2)$.

By the theorem of Cramér-Lévy this requires that both factors be normal with
moments which add to $\theta_0$, $\sigma_0^2$. By the identifiability of $\mathcal{H}_v$, the second factor is
normal if and only if $G_1$ is degenerate.

Corollary: A mixture of normal distributions with identical means cannot be
normal.^6

It follows immediately from (ii) or (iii) of Theorem 6 that a finite mixture of
normal distributions cannot be normal (recall that finite here signifies at least
two). Furthermore, as a direct consequence of Theorem 7, a countable
product-measure mixture of normal distributions is non-normal. It seems intuitively
plausible that no countable mixture of normal distributions $\{\Phi(x; \theta_j, \sigma_j^2)\}$ is
normal and if the variances $\sigma_j^2$ have a minimum, this is indeed the case. This
follows from a somewhat more general proposition.

^6 This corollary and the fact that $G_1$ is degenerate also flow from Theorem 4. The former
is also implicit in a theorem of [2].

Suppose that $\mu$ is such that for some real $\theta_0$ and $\sigma_0^2 > 0$, $\mu\{\sigma^2 \mid \sigma^2 < \sigma_0^2\} = 0$
while

        $\mu\{\theta, \sigma^2 \mid \theta = \theta_0, \sigma^2 = \sigma_0^2\} = p_0 > 0$.

Then from (6.5) for any $\mu$ satisfying (8)

        $e^{-t^2/2} = p_0\, e^{-\sigma_0^2 t^2/2 + it\theta_0} + \int_S e^{-\sigma^2 t^2/2 + it\theta}\, d\mu$,

where $S = \{\theta, \sigma^2 \mid \sigma^2 = \sigma_0^2,\ \theta \ne \theta_0,\ \text{or}\ \sigma^2 > \sigma_0^2\}$. Thus,

        $e^{-(1 - \sigma_0^2)t^2/2} = p_0\, e^{it\theta_0} + \int_S e^{-(\sigma^2 - \sigma_0^2)t^2/2 + it\theta}\, d\mu$.

Since both terms on the right hand side are (to within constants) c.f.'s, this
would imply that a continuous c.d.f., namely $\Phi(x; 0, 1 - \sigma_0^2)$, was a mixture of
a discrete and some other distribution, which is patently false. Thus, if a
$\mu$-mixture of $\{\Phi(x; \theta, \sigma^2)\}$ is normal, the mixing measure cannot be as supposed here.
In particular, if the infimum of the variances is attained, a countable mixture
of normal c.d.f.'s is non-normal. If it is not attained, we may suppose that a
subsequence of variances approaches zero, whence from (iv) of Theorem 6, it
follows that the same conclusion holds if zero is not a value or a point of
accumulation of $\{\theta_j\}$.

According to (v) of Theorem 6, a countable mixture of normal c.d.f.'s cannot
be normal if the means are a set of numbers in arithmetic progression (or a
subset thereof); the case of only finitely many different means can be disposed of
by a number-theoretic argument but the question of an arbitrary countable
mixture remains open.

Given any bounded sequence $\{\sigma_j^2\}$ of (distinct) positive real numbers, say in
(0, 1), and arbitrary positive $\epsilon$, there exist sequences $\{\theta_j\}$, $\{c_j\}$ with $c_j > 0$,
$\sum_{j=1}^{\infty} c_j = 1$ such that

        $\sup_x \left| \sum_j c_j \Phi(x; \theta_j, \sigma_j^2) - \Phi(x) \right| < \epsilon$.

This statement follows from Theorems 1 and 5 and shows that a countable
mixture of normal distributions can be arbitrarily close to a normal distribution.
The known relationship

        $\int_0^{\infty} \frac{1}{2\sqrt{\pi\alpha}}\, e^{-x^2/4\alpha}\, e^{-\alpha}\, d\alpha = \frac{1}{2}\, e^{-|x|}$

reveals that an exponential-mixture of normal distributions (with identical
means) has the so-called Laplace distribution.
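
The exponential-mixture relationship can be verified numerically. The sketch below is an illustration rather than anything taken from the paper: it assumes the identity in the form $\int_0^{\infty} (4\pi\alpha)^{-1/2} e^{-x^2/4\alpha} e^{-\alpha}\, d\alpha = \frac{1}{2}e^{-|x|}$, substitutes $\alpha = u^2$ to remove the integrable singularity at the origin, and applies Simpson's rule on a truncated range. The function names, the truncation point and the step count are all arbitrary choices.

```python
import math

def exponential_normal_mixture_density(x, upper=10.0, steps=4000):
    """Simpson approximation of the mixture density at x.

    After alpha = u**2 the integrand becomes
    exp(-x^2 / (4 u^2) - u^2) / sqrt(pi) on u in (0, infinity).
    """
    def integrand(u):
        if u == 0.0:
            # Limit of the integrand as u -> 0+.
            return 1.0 / math.sqrt(math.pi) if x == 0.0 else 0.0
        return math.exp(-x * x / (4.0 * u * u) - u * u) / math.sqrt(math.pi)

    h = upper / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def laplace_density(x):
    """Density of the Laplace distribution with unit scale."""
    return 0.5 * math.exp(-abs(x))

# Largest discrepancy over a few sample points.
max_gap = max(abs(exponential_normal_mixture_density(x) - laplace_density(x))
              for x in (0.0, 0.5, 1.0, 2.5))
```

Here each normal component has mean zero and variance $2\alpha$, and the mixing density on $\alpha$ is $e^{-\alpha}$.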


5. Generation of mixtures. If $(X, \alpha)$ have a joint distribution in $R^{m+1}$ then
$H(x) = \int F(x \mid \alpha)\, dG(\alpha)$ and (dually) $G(\alpha) = \int K(\alpha \mid x)\, dH(x)$. Moreover,
if for some measure $\nu$ on $R^m$ (independent of $x$), $dK/d\nu = k(\alpha \mid x)$ exists, then
the Radon-Nikodym derivative $dG/d\nu = g(\alpha) = \int_{-\infty}^{\infty} k(\alpha \mid x)\, dH(x)$ likewise
exists and $H$ is representable in the form

(10)    $H(x) = \int F(x \mid \alpha)\, g(\alpha)\, d\nu = \int \left[\frac{\int_{-\infty}^{x} k(\alpha \mid y)\, dH(y)}{\int_{-\infty}^{\infty} k(\alpha \mid y)\, dH(y)}\right] g(\alpha)\, d\nu$.

Since $H$ is a c.d.f., $d\mu = g\, d\nu$ represents a probability measure. If, in addition,
$h(x) = dH/d\lambda$ exists for some linear measure $\lambda$, $dF/d\lambda = f(x \mid \alpha) =
k(\alpha \mid x)h(x)/g(\alpha)$ whence

(11)    $h(x) = \int \frac{k(\alpha \mid x)h(x)}{g(\alpha)}\, d\mu = \int f(x \mid \alpha)\, d\mu$.

When $m = 1$, $\alpha$ concentrates on 0, 1, 2, ... and $\nu$ is counting measure, the
preceding reduces to

(12)    $h(x) = \sum_{j=0}^{\infty} g_j f(x \mid j)$ with $f(x \mid j) = k(j \mid x)h(x)g_j^{-1}$

and (dually) $g_j = \int k(j \mid x)h(x)\, d\lambda$ with $k(j \mid x) = g_j f(x \mid j)[h(x)]^{-1}$.

In particular, if $k(\alpha \mid x) = 1$ for $x \in I_{\alpha} = (a_{\alpha}, a_{\alpha+1})$ and zero otherwise, where
$a_0 = -\infty$, $\lim_{\alpha \to \infty} a_{\alpha} = +\infty$ and $a_{\alpha} < a_{\alpha+1}$, $\alpha = 0, 1, 2, \ldots$, then $g(\alpha) =
H(a_{\alpha+1}) - H(a_{\alpha})$ and (10) exhibits any non-degenerate c.d.f. $H$ as a finite or
countable mixture of c.d.f.'s $H_{\alpha}$ formed by truncating the distribution $H$
outside $I_{\alpha}$. If $a_N = +\infty$ for some finite integer $N > 0$, the mixture is necessarily
finite. This method of splitting apart and splicing together a c.d.f. is somewhat
artificial but other choices of $k(\alpha \mid x)$ yield more interesting mixtures.

A slightly altered formulation leading to (12) (or (11)) starts with the
selection of non-negative measurable functions $g_j(x)$ and constants $a_j \ge 0$ for
which $g(x) = \sum_{j=0}^{\infty} a_j g_j(x)$ is positive and finite on a set $S$ of positive Lebesgue
measure. If $h(x)$ is any p.d.f. with spectrum $S$ and such that $0 <
\int_S g_j(x)h(x)[g(x)]^{-1}\, dx = b_j < \infty$, $j = 0, 1, 2, \ldots$, then $h$ is a $c_j$-mixture of
$\{f_j(x)\}$ where $c_j = a_j b_j$, and $f_j(x) = g_j(x)h(x)[b_j g(x)]^{-1}$, $x \in S$ and zero
elsewhere. Note that $k(j \mid x) = a_j g_j(x)[g(x)]^{-1}$.

An interesting particularization arises from taking $g_j(x) = |x|^j$, $a_j \ge 0$.
Thus, if $g(x) = \sum_j a_j |x|^j$ converges for $x \in S$ and $h$ is as just indicated, the
prior conclusion holds with $f_j(x) = (|x|^j h(x))/(b_j g(x))$, $x \in S$ and zero
elsewhere.

Example 1: Gamma distribution as a negative binomial mixture of commonly
scaled but differently exponented Gamma c.d.f.'s.

Let $a > 1$, $\lambda > 0$ and choose $g(x) = e^{(a-1)x}$, $h(x) = x^{\lambda-1}e^{-x}[\Gamma(\lambda)]^{-1}$ on
$S = (0, \infty)$. Then $b_j = a^{-(\lambda+j)}\,\Gamma(\lambda + j)[\Gamma(\lambda)]^{-1}$,
$c_j = \binom{-\lambda}{j} a^{-\lambda}(a^{-1} - 1)^j$ and

        $f_j(x) = \frac{a^{\lambda+j}\, x^{\lambda+j-1}\, e^{-ax}}{\Gamma(\lambda + j)}$ on $S$.

In the earlier notation, $k(j \mid x) = (1/j!)\, e^{-(a-1)x}[(a - 1)x]^j$. This example appears
in [14] with $\lambda = n/2$ and a change of scale.
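
Example 1 is easy to confirm numerically. The following Python sketch is a hedged illustration: the function names and the truncation at 400 terms are assumptions, and the weights are written in the equivalent positive form $c_j = \frac{\Gamma(\lambda+j)}{j!\,\Gamma(\lambda)} a^{-\lambda}(1 - a^{-1})^j$. It checks that the weights sum to one and that the weighted Gamma densities with rate $a$ add up to the Gamma density $h$ with unit rate.

```python
import math

def gamma_pdf(x, shape, rate):
    """Gamma density rate^shape x^(shape-1) e^(-rate x) / Gamma(shape), x > 0."""
    log_pdf = (shape * math.log(rate) + (shape - 1.0) * math.log(x)
               - rate * x - math.lgamma(shape))
    return math.exp(log_pdf)

def neg_binomial_weight(j, lam, a):
    """c_j = Gamma(lam+j) / (j! Gamma(lam)) * a^(-lam) * (1 - 1/a)^j, for a > 1."""
    log_c = (math.lgamma(lam + j) - math.lgamma(j + 1) - math.lgamma(lam)
             - lam * math.log(a) + j * math.log(1.0 - 1.0 / a))
    return math.exp(log_c)

def mixture_pdf(x, lam, a, terms=400):
    """Truncated sum of c_j * f_j(x), with f_j the Gamma(lam + j, rate a) density."""
    return sum(neg_binomial_weight(j, lam, a) * gamma_pdf(x, lam + j, a)
               for j in range(terms))

# Hypothetical parameter values for the check.
LAM, A = 1.5, 3.0
weight_total = sum(neg_binomial_weight(j, LAM, A) for j in range(400))
```

Working with logarithms through `math.lgamma` avoids overflow in $\Gamma(\lambda + j)$ for large $j$.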
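Example 2^7: Select $g(x) = \sigma \exp\{(x^2/2)(1 - \sigma^2)/\sigma^2\}$, $0 < \sigma^2 < 1$, and
$h(x) = \Phi'(x)$. Then

        $b_{2j} = \frac{(2j)!}{2^j j!}\, \sigma^{2j}$,  $a_{2j} = \frac{\sigma}{2^j j!}\, ((1 - \sigma^2)/\sigma^2)^j$,  $a_{2j+1} = 0$,

whence

        $\Phi'(x) = \sum_{j=0}^{\infty} \binom{2j}{j}\, \sigma \left(\frac{1 - \sigma^2}{4}\right)^j \left[\frac{2^j j!}{(2j)!\, \sigma^{2j+1}} \cdot \frac{x^{2j} e^{-x^2/2\sigma^2}}{\sqrt{2\pi}}\right] = \sum_{j=0}^{\infty} c_j f_j(x)$,

expressing the normal density function as a discrete mixture of the bracketed
densities $f_j(x)$.

The c.f. (see appendix) of $f_j(x)$ is

        $\varphi_j(t) = \frac{H_{2j}(\sigma t)}{H_{2j}(0)}\, e^{-\sigma^2 t^2/2}$,

where

        $H_{2j}(t) = e^{t^2/2} (-1)^{2j} \frac{d^{2j}}{dt^{2j}}\, e^{-t^2/2} = \sum_{i=0}^{j} (-1)^i \binom{2j}{2i} [1 \cdot 3 \cdots (2i - 1)]\, t^{2(j-i)}$

is one version of the Hermite polynomial of order $2j$. The preceding bracketed
expression is defined to be one for $i = 0$.

The restatement of the mixture in terms of c.f.'s yields an exponential
expansion in terms of even degree Hermite polynomials, viz.,

        $\frac{1}{\sigma} \exp\left\{-\frac{t^2}{2}\left(\frac{1 - \sigma^2}{\sigma^2}\right)\right\} = \sum_{j=0}^{\infty} \frac{(-1)^j}{j!} \left(\frac{1 - \sigma^2}{2}\right)^j H_{2j}(t)$,

which is virtually that of [12, p. 580].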
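
The discrete mixture of Example 2 can be checked with a few lines of Python. This is a sketch under stated assumptions: the weights are taken as $c_j = \binom{2j}{j}\sigma((1-\sigma^2)/4)^j$ and the components as the densities proportional to $x^{2j}e^{-x^2/2\sigma^2}$, the series is truncated at 80 terms, and all names are illustrative.

```python
import math

def weight(j, sigma):
    """c_j = C(2j, j) * sigma * ((1 - sigma^2) / 4)^j."""
    return math.comb(2 * j, j) * sigma * ((1.0 - sigma ** 2) / 4.0) ** j

def component_density(x, j, sigma):
    """f_j(x) = 2^j j! x^(2j) exp(-x^2 / (2 sigma^2)) / ((2j)! sigma^(2j+1) sqrt(2 pi))."""
    numerator = 2 ** j * math.factorial(j) * x ** (2 * j) * math.exp(-x * x / (2.0 * sigma ** 2))
    denominator = math.factorial(2 * j) * sigma ** (2 * j + 1) * math.sqrt(2.0 * math.pi)
    return numerator / denominator

def mixture_density(x, sigma, terms=80):
    """Truncated sum of c_j f_j(x); it should approximate the standard normal density."""
    return sum(weight(j, sigma) * component_density(x, j, sigma) for j in range(terms))

def standard_normal_density(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

SIGMA = 0.6  # hypothetical choice with 0 < sigma^2 < 1
weight_total = sum(weight(j, SIGMA) for j in range(80))
```

The truncation at 80 terms keeps `math.factorial(2 * j)` within floating-point range; a larger cut-off would need logarithms as in a log-gamma formulation.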
Example 3: $\Phi(x)$ as a more elaborate mixture of normal distributions.
Let $\{Q_i(x)\}$ be a sequence of positive definite quadratic forms with $\sum_{i=1}^{\infty} Q_i(x)
= \infty$. Take $k(j \mid x) = \exp\{-\frac{1}{2}\sum_{i=1}^{j} Q_i(x)\} - \exp\{-\frac{1}{2}\sum_{i=1}^{j+1} Q_i(x)\} > 0$,
$j = 0, 1, \ldots$ and define constants $c_j$, $\theta_j$, $s_j$, $j \ge 0$ by $\Phi'(x) \exp\{-\frac{1}{2}\sum_{i=1}^{j} Q_i(x)\}
= (c_j/s_j)\, \Phi'((x - \theta_j)/s_j)$. It follows that $g_j = c_j - c_{j+1} > 0$, and letting $d_j
= c_{j+1}/(c_j - c_{j+1})$,

        $F(x \mid j) = (1 + d_j)\Phi(x; \theta_j, s_j^2) - d_j \Phi(x; \theta_{j+1}, s_{j+1}^2)$.

Now $\{s_j\}$ is decreasing and positive and if $\sigma_j$ is sufficiently small, $F(x \mid j)$
contains the factor $\Phi(x; 0, \sigma_j^2)$ and is thus representable as

(13)    $F(x \mid j) = \Phi(x; 0, \sigma_j^2) * \{(1 + d_j)\Phi(x; \theta_j, s_j^2 - \sigma_j^2) - d_j \Phi(x; \theta_{j+1}, s_{j+1}^2 - \sigma_j^2)\}$.

^7 The referee has pointed out that Example 2 may be obtained directly from Example 1.
He has also suggested a reformulation of the main idea of this section, leading to greater
cohesion.

If the $\sigma_j^2$ are distinct, the bracketed term may be regarded as the value at $\sigma^2 = \sigma_j^2$
of a conditional distribution function $G_{\sigma^2}(x)$ and the mixture $\sum_{j=0}^{\infty} g_j F(x \mid j)$
has the structure (6), viz.,

        $\Phi(x) = \sum_{j=0}^{\infty} g_j [\Phi(x/\sigma_j) * G_{\sigma_j^2}(x)]$

with $G_1(\sigma^2)$ discrete.

In particular, if $Q_i(x) = (x^2 - 2x - 2 \ln a)$ where $0 < a < e^{-1/2}$, calculations
yield $\theta_j = j/(j + 1)$, $s_j^2 = 1/(j + 1)$ and $d_j = q_j/(1 - q_j)$ where $q_j
= a((j + 1)/(j + 2))^{1/2} \exp\{\frac{1}{2} - 1/(2(j + 1)(j + 2))\}$. Finally, if $\sigma_j^2
\le (1 - a^2 e)/(j(1 - a^2 e) + 2 - a^2 e)$, the bracketed quantity in (13) will be a
c.d.f.
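
The particular case of Example 3 can be explored numerically. The Python sketch below rests on assumptions that are not spelled out in the text: it takes $c_j = a^j (j+1)^{-1/2}\exp\{j^2/(2(j+1))\}$, which follows from completing the square in $\Phi'(x)\exp\{-\frac{j}{2}(x^2 - 2x - 2\ln a)\}$, truncates the series at 400 terms, and uses illustrative function names. It checks that $\sum_j g_j F(x \mid j)$ reproduces $\Phi(x)$ and that the Lemma's ratio for the bracket in (13) equals one exactly at the stated bound on $\sigma_j^2$.

```python
import math

def norm_cdf(x, mean=0.0, var=1.0):
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def c(j, a):
    """c_j = a^j (j+1)^(-1/2) exp{j^2 / (2 (j+1))} (derived, see lead-in)."""
    return a ** j * math.exp(j * j / (2.0 * (j + 1))) / math.sqrt(j + 1)

def mixture_cdf(x, a, terms=400):
    """Truncated sum of g_j F(x | j), theta_j = j/(j+1), s_j^2 = 1/(j+1)."""
    total = 0.0
    for j in range(terms):
        g = c(j, a) - c(j + 1, a)      # mixing weight g_j
        d = c(j + 1, a) / g            # d_j = c_{j+1} / (c_j - c_{j+1})
        f = ((1.0 + d) * norm_cdf(x, j / (j + 1), 1.0 / (j + 1))
             - d * norm_cdf(x, (j + 1) / (j + 2), 1.0 / (j + 2)))
        total += g * f
    return total

def sigma2_bound(j, a):
    """Upper bound on sigma_j^2 quoted at the end of Example 3."""
    r = 1.0 - a * a * math.e
    return r / (j * r + 1.0 + r)

def lemma_ratio(j, a, sigma2):
    """Left side of the Lemma's inequality for the bracket in (13)."""
    q = c(j + 1, a) / c(j, a)          # equals d_j / (1 + d_j)
    var1 = 1.0 / (j + 1) - sigma2
    var2 = 1.0 / (j + 2) - sigma2
    gap = (j + 1) / (j + 2) - j / (j + 1)
    return q * math.sqrt(var1 / var2) * math.exp(gap ** 2 / (2.0 * (var1 - var2)))

A = 0.5  # hypothetical value with 0 < a < exp(-1/2)
```

The sum telescopes to $c_0\Phi(x) - c_J\Phi(x; \theta_J, s_J^2)$, so the truncation error is governed by $c_J$, which decays geometrically when $a e^{1/2} < 1$.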


6. Remarks on the compound Poisson distribution. It seems of interest to
note that the problem of mixtures of Poisson distributions is intimately linked
to the moment problem. Since, as mentioned earlier,

        $H(x) = \int_0^{\infty} \sum_{j < x} \frac{\alpha^j e^{-\alpha}}{j!}\, dG(\alpha)$

is a discrete c.d.f. with saltuses at the non-negative integers, it is completely
characterized by the probabilities

        $p_j = \int_0^{\infty} \frac{\alpha^j e^{-\alpha}}{j!}\, dG(\alpha)$,  $j = 0, 1, 2, \ldots$

Let $G^*(\alpha) = (1/p_0) \int_0^{\alpha} e^{-u}\, dG(u)$. Then $G^*$ is a c.d.f. and

        $\frac{j!\, p_j}{p_0} = \int_0^{\infty} \alpha^j\, dG^*(\alpha)$,  $j = 0, 1, 2, \ldots$

Consequently, in order that a discrete distribution characterized by mass $p_j$ at $j$
($j = 0, 1, 2, \ldots$) be a mixture of Poisson distributions, it is necessary that the
sequence $\{j!\, p_j/p_0\}$ be a moment sequence for the Stieltjes moment problem. In
particular, it is necessary that the determinants ($i, j = 0, 1, \ldots, n$) $\Delta_n
= |(i + j)!\, p_{i+j}|$ and $\Delta_n^{(1)} = |(i + j + 1)!\, p_{i+j+1}|$ be non-negative for every
non-negative integer $n$. Conversely, if $\{j!\, p_j/p_0\}$ is a moment sequence on $(0, \infty)$ and
the corresponding distribution $G^*(\alpha)$ is such that $\int e^{\alpha}\, dG^*(\alpha) = 1/p_0 < \infty$,
then $\sum_{j < x} p_j$ is a compound Poisson distribution with mixing c.d.f. $G(\alpha)
= p_0 \int_0^{\alpha} e^{u}\, dG^*(u)$.

It is easy to see (and pointed out in [8]) that the class of compound Poisson 
distributions is identifiable. Thus, a mixture of Poisson distributions cannot be 
Poisson. 
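
The determinant condition of this section can be tried out in exact arithmetic. The Python sketch below is illustrative and makes its own assumptions: the helper names are hypothetical, the two determinants are taken to be the Hankel determinants built from $s_k = k!\,p_k$ and from the shifted sequence $s_{k+1}$, and only finitely many orders $n$ are examined, so a pass is evidence and not a proof. The geometric distribution $p_j = 2^{-(j+1)}$ (a Poisson mixture with exponential mixing density $e^{-\alpha}$) passes, whereas the binomial distribution with two trials fails.

```python
from fractions import Fraction
from math import factorial

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return det

def hankel_determinants(p, n):
    """Return the pair of order-n determinants built from s_k = k! p_k."""
    s = [Fraction(factorial(k)) * Fraction(p[k]) for k in range(2 * n + 2)]
    plain = determinant([[s[i + j] for j in range(n + 1)] for i in range(n + 1)])
    shifted = determinant([[s[i + j + 1] for j in range(n + 1)] for i in range(n + 1)])
    return plain, shifted

def passes_necessary_condition(p, max_n):
    """True when both determinants are non-negative for n = 0, ..., max_n."""
    return all(min(hankel_determinants(p, n)) >= 0 for n in range(max_n + 1))

geometric = [Fraction(1, 2 ** (j + 1)) for j in range(8)]
binomial = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)] + [Fraction(0)] * 5
```

For the binomial case the failure already occurs at $n = 1$, where the first determinant equals $-1/8$.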


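APPENDIX
We show by induction that the c.f. of $f_j(x)$ of Example 2 of Section 5 is

        $\varphi_j(t) = \frac{H_{2j}(\sigma t)}{H_{2j}(0)}\, e^{-\sigma^2 t^2/2}$,

where

        $H_{2j}(t) = e^{t^2/2} (-1)^{2j} \frac{d^{2j}}{dt^{2j}}\, e^{-t^2/2} = \sum_{i=0}^{j} (-1)^i \binom{2j}{2i} [1 \cdot 3 \cdots (2i - 1)]\, t^{2(j-i)}$

is a version of the Hermite polynomial of order $2j$. The preceding bracketed
expression is defined to be 1 for $i = 0$.

This is evident for $j = 0$ and (supposing $\sigma = 1$ for simplicity) follows, for
$j = 1$, from

        $T\left(\frac{x^2 e^{-x^2/2}}{\sqrt{2\pi}}\right) = -\frac{d^2}{dt^2}\, T\left(\frac{e^{-x^2/2}}{\sqrt{2\pi}}\right) = -\frac{d^2}{dt^2}\, e^{-t^2/2} = -H_2(t)\, e^{-t^2/2}$,

where $T$ denotes the operation of Fourier transformation. In general, it suffices
to verify that for $j \ge 2$,

(14)    $T\left(\frac{x^{2j} e^{-x^2/2}}{\sqrt{2\pi}}\right) = (-1)^j H_{2j}(t)\, e^{-t^2/2}$.

Now

        $-t^2\, T\left(\frac{x^{2j} e^{-x^2/2}}{\sqrt{2\pi}}\right) = T\left(\frac{d^2}{dx^2} \frac{x^{2j} e^{-x^2/2}}{\sqrt{2\pi}}\right)$

        $\ \ = T\left\{\frac{[x^{2j+2} - (4j + 1)x^{2j} + 2j(2j - 1)x^{2j-2}]\, e^{-x^2/2}}{\sqrt{2\pi}}\right\}$.

Hence, if (14) holds for $j - 1$ and $j$ ($j \ge 1$), it holds for $j + 1$, since

        $T\left(\frac{x^{2j+2} e^{-x^2/2}}{\sqrt{2\pi}}\right) = (-1)^{j+1} e^{-t^2/2} \{(t^2 - 4j - 1)H_{2j}(t) - 2j(2j - 1)H_{2j-2}(t)\}$
        $\ \ = (-1)^{j+1} H_{2j+2}(t)\, e^{-t^2/2}$

in view of the recursion relation (verified by direct substitution)

        $(t^2 - 4j - 1)H_{2j}(t) - 2j(2j - 1)H_{2j-2}(t) = H_{2j+2}(t)$,  $j \ge 1$.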


REFERENCES

[1] Patrick Billingsley, "Invariance principle for dependent random variables," Trans.
Amer. Math. Soc., Vol. 83 (1956), pp. 250-268.

[2] J. Blum, H. Chernoff, M. Rosenblatt and H. Teicher, "Central limit theorems for
interchangeable processes," Can. Jour. Math., Vol. X (1958), pp. 222-229.

[3] N. Bourbaki, "Éléments de mathématique, Livre VI, Intégration," Chapter 5,
Intégration des mesures, Hermann, Paris, 1957.

[4] Herman Chernoff, "Large sample theory: Parametric case," Ann. Math. Stat., Vol.
27 (1956), pp. 1-22.

[5] H. Cramér, "Über eine Eigenschaft der normalen Verteilungsfunktion," Math. Zeit.,
Vol. 41 (1936), pp. 405-414.

[6] Jean Dieudonné, "Sur la convergence des suites de mesures de Radon," Anais da
Academia Brasileira de Ciencias, Vol. 23 (1951), pp. 21-38.

[7] Daniel Dugué, "Arithmétique des lois de probabilités," Mémorial des Sciences
Mathématiques, fascicule CXXXVII, Gauthier-Villars, Paris, 1957.

[8] W. Feller, "On a general class of contagious distributions," Ann. Math. Stat., Vol.
14 (1943), pp. 389-399.

[9] Bruno de Finetti, "La prévision, ses lois logiques, ses sources subjectives," Annales
de l'Institut Henri Poincaré, Vol. 17 (1937), pp. 1-68.

[10] Lawrence M. Graves, The Theory of Functions of Real Variables, McGraw-Hill, New
York, 1946.

[11] Edwin L. Hewitt and Leonard J. Savage, "Symmetric measures on Cartesian
products," Trans. Amer. Math. Soc., Vol. 80 (1955), pp. 470-501.

[11a] E. Cansado Maceda, "On the compound and generalized Poisson distributions,"
Ann. Math. Stat., Vol. 19 (1948), pp. 414-16.

[11b] Per Ottestad, "On certain compound frequency distributions," Skand. Akt., Vol.
27 (1944), pp. 32-42.

[12] Harry Pollard, "Distribution functions containing a Gaussian factor," Proc. Amer.
Math. Soc., Vol. 4 (1953), pp. 578-582.

[13] Herbert Robbins, "Mixture of distributions," Ann. Math. Stat., Vol. 19 (1948), pp.

[14] Herbert E. Robbins and E. J. G. Pitman, "Application of the method of mixtures to
quadratic forms in normal variates," Ann. Math. Stat., Vol. 20 (1949), pp. 552-560.

[15] Henry Scheffé, "A useful convergence theorem for probability distributions," Ann.
Math. Stat., Vol. 18 (1947), pp. 434-438.

[16] H. M. Schwartz, "Sequences of Stieltjes integrals, III," Duke Math. Journal, Vol. 10
(1943), pp. 595-610.

[17] J. A. Shohat and J. D. Tamarkin, The Problem of Moments, Amer. Math. Soc., New
York, 1943.

[17a] George P. Steck, "Limit theorems for conditional distributions," Univ. of Calif.
Pub. in Stat., Vol. 2, No. 12, pp. 237-84.

[18] Henry Teicher, "On the convolution of distributions," Ann. Math. Stat., Vol. 25
(1954), pp. 775-778.

[19] Henry Teicher, "On the convergence of projected distributions," Ann. Inst. of Stat.
Math., Vol. 9, No. 2 (1958), pp. 79-86.

[20] Aurel Wintner, "Stratifications of Cauchy's 'stable' transcendents and of
Mittag-Leffler's entire functions," Amer. J. of Math., Vol. 80 (1958), pp. 111-124.





ON INTERCHANGING LIMITS AND INTEGRALS 


John W. Pratt^1
Harvard University 


One frequently wants to show $\lim \int f_n = \int \lim f_n$; that is, knowing $f_n \to f$
pointwise, one wants to show $\int f_n \to \int f$. Commonly used criteria are those of
the Lebesgue (dominated or bounded) convergence theorem [1, Theorem 26.D;
2, Theorem 7.2C; etc.] and Scheffé's "Useful Convergence Theorem for
Probability Distributions" [3]. The following criterion sometimes applies more
directly and is never much harder to apply. Informally, the criterion is that
$f_n$ shall be bounded above and below by functions which converge pointwise
and in integral; or, in other words, a convergent sequence permits exchange of
$\lim$ and $\int$ if it is bracketed by two sequences which permit this exchange.
Specifically,

THEOREM 1. If

(i) $f_n \to f$, $g_n \to g$, $G_n \to G$,
(ii) $g_n \le f_n \le G_n$ for all $n$,
(iii) $\int g_n \to \int g$ and $\int G_n \to \int G$ with $\int g$ and $\int G$ finite,
then $\int f_n \to \int f$ and $\int f$ is finite.


(i) and (ii) may be interpreted as holding at each point and the integrals as 
ordinary (Lebesgue) integrals over a fixed (Lebesgue measurable) subset of the 
real line or k-dimensional Euclidean space. 

More generally, it is assumed throughout this note that all integrals are
taken with respect to the same measure $\mu$ on a Borel field $\mathcal{B}$, all sets mentioned
are measurable, all functions mentioned are measurable from $\mathcal{B}$ to the class of
Borel sets, inequalities like (ii) hold almost everywhere [$\mu$], and convergence of
functions, as in (i), is either almost everywhere [$\mu$] or in measure [$\mu$]. Proofs
will be given for the case of convergence almost everywhere. The more general
case follows, since every subsequence of a sequence which converges in measure
has a subsubsequence which converges almost everywhere.

In Theorem 1, Corollary 1, and Corollary 4, $\int$ may be replaced by $\int_B$ for a
fixed set $B$ (and $\int_S$ by $\int_{S \cap B}$) provided all integrals are so replaced. This is not
a real generalization, being the result of substituting $\mu_B$ for $\mu$, where

        $\mu_B(S) = \mu(B \cap S)$

for all $S$.
PROOF OF THEOREM 1: $0 \le f_n - g_n \to f - g$ and $0 \le G_n - f_n \to G - f$. We
can thus apply Fatou's Lemma [1, Theorem 27.F, etc.]. (The Lemma says
$\int \liminf h_n \le \liminf \int h_n$ for $h_n \ge 0$.) We obtain:

        $\int f - \int g = \int \lim (f_n - g_n) \le \liminf \int (f_n - g_n) = \liminf \int f_n - \int g$;

        $\int G - \int f = \int \lim (G_n - f_n) \le \liminf \int (G_n - f_n) = \int G - \limsup \int f_n$.

Therefore, $\limsup \int f_n \le \int f \le \liminf \int f_n$, so $\int f_n \to \int f$, q.e.d.

Received February 13, 1959; revised August 17, 1959.
^1 This research has been supported in part by the United States Navy through the
Office of Naval Research, under contract Nonr 1866(37). Reproduction in whole or in
part is permitted for any purpose of the United States Government.

COROLLARY 1. If (i)-(iii) hold and, in addition

(iv) $g_n \le 0 \le G_n$ for all $n$,
then

(a) $\int |f_n - f| \to 0$;

(b) $\int_S f_n \to \int_S f$ uniformly in $S$ ($S$ measurable);

(c) $\int h f_n \to \int h f$ for all bounded functions $h$, uniformly in $h$ for each bound.

(a)-(c) are equivalent, as is well known, since it is immediate that (a)
implies (c), (c) implies (b), and (b) implies (a). To prove (i)-(iv) imply (a),
note that, by (ii) and (iv),

        $0 \le |f_n - f| \le |f_n| + |f| \le G_n - g_n + G - g \to 2(G - g)$,

while, by (iii), $\int (G_n - g_n + G - g) \to \int 2(G - g)$, which is finite. Thus Theorem
1 applies with $|f_n - f|$ for $f_n$, 0 for $g_n$, and $G_n - g_n + G - g$ for $G_n$. (b) and
(c) without uniformity are perhaps even more direct applications of Theorem 1.

Conditions (i)-(iii) alone do not imply (a)-(c), nor can the region of
integration in the conclusion of Theorem 1 be different from that in (iii) in
general. For trivial counter-examples, let $f_1(x) = -1$ for $-1 < x < 0$, $= 1$ for
$0 < x < 1$, and $= 0$ otherwise; let $f = g = G = 0$; let $f_n(x) = g_n(x) =
G_n(x) = n f_1(nx)$ or $n^{-1} f_1(n^{-1}x)$; and let the integrals be ordinary (Lebesgue)
integrals from $-1$ to 1 or $-\infty$ to $\infty$. Then (i)-(iii) hold but (a)-(c) do not,
nor can the integrals in the conclusion of Theorem 1 be taken over positive $x$
only. The choice $n f_1(nx)$ from $-1$ to 1 gives finite measure and the choice
$n^{-1} f_1(n^{-1}x)$ from $-\infty$ to $\infty$ gives uniformly bounded functions. If the measure
is finite and the functions are uniformly bounded, of course, $f_n \to f$ implies
(a)-(c) without further conditions.
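
The first counter-example can be made concrete with a few lines of Python. The sketch is illustrative only: the midpoint rule, the cell counts and the function names are choices made here, not part of the note. For $f_n(x) = n f_1(nx)$ on $[-1, 1]$ it shows that $\int f_n = 0$ for every $n$, while $\int |f_n - f| = 2$ and the integral over positive $x$ alone equals 1, so neither (a) nor a change of the region of integration is available.

```python
def f1(x):
    """The step function of the counter-example."""
    if -1.0 < x < 0.0:
        return -1.0
    if 0.0 < x < 1.0:
        return 1.0
    return 0.0

def f_n(x, n):
    """f_n(x) = n * f1(n x); it converges to 0 at every fixed x."""
    return n * f1(n * x)

def midpoint_integral(func, lo, hi, cells):
    """Midpoint rule; exact for step functions whose jumps fall on cell edges."""
    width = (hi - lo) / cells
    return sum(func(lo + (k + 0.5) * width) for k in range(cells)) * width

def integrals(n):
    """Return (integral of f_n, integral of |f_n|, integral of f_n over x > 0)."""
    cells = 100 * n  # jumps at 0 and +-1/n coincide with cell edges
    whole = midpoint_integral(lambda x: f_n(x, n), -1.0, 1.0, cells)
    absolute = midpoint_integral(lambda x: abs(f_n(x, n)), -1.0, 1.0, cells)
    positive = midpoint_integral(lambda x: f_n(x, n), 0.0, 1.0, 50 * n)
    return whole, absolute, positive
```

Since $f = g = G = 0$ and $g_n = G_n = f_n$, conditions (i)-(iii) hold trivially here, while (iv) fails because $g_n$ takes positive values.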

Theorem 1 and Corollary 1 reduce to the Lebesgue convergence theorem when
$G_n = G = -g = -g_n \ge 0$.

The next corollary is Scheffé’s theorem. 

COROLLARY 2. If all $f_n$ and $f$ are probability densities and $f_n \to f$, then (a)-(c)
hold.

PROOF. (i)-(iv) are satisfied by $g_n = g = 0$, $G_n = f_n$, $G = f$. Thus, Corollary
1 applies, q.e.d.
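
Corollary 2 can be illustrated numerically. In the Python sketch below, which is an example chosen here and not one from the note, $f_n$ is the normal density with mean zero and variance $1 + 1/n$ and $f$ is the standard normal density; the trapezoidal rule on $[-12, 12]$ approximates $\int |f_n - f|$, which should decrease toward zero as conclusion (a) asserts. The interval and the step count are arbitrary.

```python
import math

def normal_pdf(x, var):
    """Density of the normal distribution with mean zero and variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def l1_distance(n, half_width=12.0, steps=24000):
    """Trapezoidal approximation of the integral of |f_n - f|."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        gap = abs(normal_pdf(x, 1.0 + 1.0 / n) - normal_pdf(x, 1.0))
        total += gap if 0 < k < steps else 0.5 * gap
    return total * h

distances = [l1_distance(n) for n in (1, 10, 100, 1000)]
```

Pointwise convergence of the densities is evident here, and the computed distances shrink roughly in proportion to $1/n$.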

Suppose $P_n$ and $P$ are the probability measures given by the probability
densities $f_n$ and $f$; that is, $P_n(S) = \int_S f_n$, $P(S) = \int_S f$. It is an immediate
consequence of Corollary 2 that, in Euclidean space, $f_n \to f$ implies $P_n$ converges in
distribution to $P$ (in the usual sense that the c.d.f. of $P_n$ approaches the c.d.f.
of $P$ at points of continuity of the latter. $\mathcal{B}$ must include Borel sets, but $\mu$ need
not be Lebesgue measure). It is more illuminating to compare consequences (b)
and (c) of convergence of densities to a density with the following conditions,
each of which is equivalent to convergence in distribution in Euclidean space.
(This is well known; in fact, (c') is often used to define convergence in
distribution more generally.)

(b') $P_n(S) \to P(S)$ for every set $S$ whose boundary has $P$-measure 0.

(c') $\int h\, dP_n \to \int h\, dP$ for every bounded continuous function $h$.

Provided open sets are measurable, (b') and (c') are obviously weaker than
(b) and (c).

COROLLARY 3. A density which is continuous in a parameter has continuous (in fact, equicontinuous) power and Type II error functions.

PROOF. Suppose $f(x, \theta)$ is a density function for each $\theta$ and continuous in $\theta$ for each $x$ (or, more generally, for almost all $x$ at each value of $\theta$). The power function of a test is $\alpha(\theta) = \int h(x) f(x, \theta)$ where $0 \le h \le 1$ and the integration is over $x$. Given any sequence $\theta_n \to \theta$, Corollary 2 applies with $f_n(x) = f(x, \theta_n)$, $f(x) = f(x, \theta)$, giving $\alpha(\theta_n) \to \alpha(\theta)$ uniformly in $h$, q.e.d.

Another consequence of Theorem 1 is that a function may be differentiated 
with respect to a parameter under the integral sign if its derivative is bracketed 
by the derivatives of two functions which permit differentiation under the 
integral sign. That is, 

COROLLARY 4. Suppose $\theta$ is a real parameter. Let $D$ denote differentiation with respect to $\theta$ and $\int$ integration over $x$. If $\theta_0$ is an interior point of an interval $I$ and

(i) $Dg(x, \theta)$, $Df(x, \theta)$, and $DG(x, \theta)$ exist for all $\theta \in I$,

(ii) $Dg(x, \theta) \le Df(x, \theta) \le DG(x, \theta)$ for all $\theta \in I$,

(iii) $D\int g(x, \theta_0) = \int Dg(x, \theta_0)$ and $D\int G(x, \theta_0) = \int DG(x, \theta_0)$ with all four quantities existing and finite,

then $D\int f(x, \theta_0) = \int Df(x, \theta_0)$ with both quantities existing and finite.

This follows from Theorem 1, since the difference quotients of $f$ lie between the corresponding difference quotients of $g$ and of $G$. It suffices that (i) and (ii) hold for almost all $x$ $[\mu]$.

When $G(x, \theta) = -g(x, \theta) = \theta G(x)$, Corollary 4 reduces to the commonly given criterion that $|Df(x, \theta)| \le G(x)$ for $G$ integrable.

The main advantage of the approach presented here is its simplicity. From
the point of view of application, Theorem 1 is a single theorem which applies 
with trivial specialization to the situations for which Lebesgue’s and Scheffé’s 
theorems are tailor-made. Furthermore, Theorem 1 implies the following facts, 
for instance, more directly than the latter theorems do. 

(1) If $f_n \to f$ and $\int |f_n| \to \int |f|$ finite, then $\int |f_n - f| \to 0$.

(2) If all $f_n$ and $f$ are densities, $f_n \to f$, all $h_n$ and $h$ are test functions (that is, $0 \le h_n \le 1$, $0 \le h \le 1$), and $h_n \to h$, then $\int h_n f_n \to \int h f$.

Pedagogically, I find Theorem 1 useful in reviewing measure theory briefly in a probability course. The most expeditious way I know to prove the Lebesgue Convergence Theorem is to prove Fatou's Lemma first [2, Section 7.2, for instance]. But the proof of Theorem 1 is a simple extension of a proof of Lebesgue's Theorem from Fatou's Lemma. Thus Corollaries 2 and 3 are obtained virtually without extra proof. Concepts not involved in the statement of the theorems (such as equicontinuity at the empty set) need never be introduced.

In fact, it is interesting to note that the equivalence of Theorem 1, the Lebesgue Convergence Theorem, Fatou's Lemma, and the Monotone Convergence Theorem [2, Theorem 7.2A, etc.] depends only on properties of measurable functions and $\int$ having to do with order and addition. Theorem 1 is especially natural in this context.

The fact that the foregoing theorems have short proofs is fortunate for the 
purposes mentioned. However, it means the individual theorems are in this 
sense not deep, and it makes it hard to verify that any particular one is new. I 
have not searched the literature thoroughly, but I have never seen even a statement of Theorem 1 or Corollary 1, although they unify important theorems on interchanging lim and $\int$, both conceptually and pedagogically. The only statement of Corollary 3 I know is Wald's [4, p. 133], although it is obviously important and frequently assumed tacitly. Corollary 4 I presume is new.


REFERENCES 


[1] Paul R. Halmos, Measure Theory, D. Van Nostrand, New York, 1950.

[2] Michel Loève, Probability Theory, D. Van Nostrand, New York, 1955.

[3] Henry Scheffé, "A Useful Convergence Theorem for Probability Distributions," Ann. Math. Stat., Vol. 18 (1947), pp. 434-438.

[4] Abraham Wald, Statistical Decision Functions, John Wiley & Sons, New York, 1950.





MOMENTS OF THE ABSOLUTE DIFFERENCE AND ABSOLUTE
DEVIATION OF DISCRETE DISTRIBUTIONS¹


By S. K. Katti


Iowa State University of Science and Technology


1. Introduction. Johnson [3], Crow [1] and Ramasubban [5] have discussed the evaluation of the mean difference and the mean deviation for some positive integral valued discrete distributions. These are particular cases of a more general statistic which may be defined as

(1a)  $\Delta_r = E|X_1 - X_2|^r$,

where $X_1$ and $X_2$ are two random variables with given distributions. Statistic (1a) will be referred to as the rth moment of the absolute difference of $X_1$ and $X_2$. In this paper, $\Delta_r$ is evaluated when $X_1$ and $X_2$ are independent and both have distributions (possibly different ones) within one of the following families of distributions: (i) Poisson, (ii) Pascal, and (iii) Binomial. The case when $X_1$ and $X_2$ are distributed as two independent Logarithmic variables, and the cases when $X_1$ and $X_2$ are independent and have distributions in two different families of distributions (chosen from the Poisson, Pascal, Binomial, and Logarithmic families), can be treated along similar lines, but the results are not given here in order to conserve space. Methods are also given to evaluate $\Delta_r$ when $X_2$ is a fixed constant and when $X_1$ is distributed as (i) a Poisson, (ii) a Pascal, (iii) a Binomial, (iv) a Hypergeometric and (v) a Logarithmic random variable. In this special case, $\Delta_r$ will be called the rth moment of the absolute deviation of $X_1$ about $X_2$ and denoted by $\delta_r$. I am investigating two sample tests, based on the sample analogues of the $\Delta_r$'s, that may be appropriate when the two samples are from two specified but different parametric populations.


2. An expression for the rth moment of the absolute difference $|X_1 - X_2|$. Let $X_1$ and $X_2$ be two arbitrary independent positive integral valued random variables with probabilities $P_i^{(1)}$ and $P_i^{(2)}$ of obtaining $X_1 = i$ and $X_2 = i$ respectively. Then the rth moment $\Delta_r$ is given by

$\Delta_r = E|X_1 - X_2|^r$

$\quad = \sum_{i,k} k^r P\{X_2 - X_1 = k \mid X_1 = i\}P\{X_1 = i\} + \sum_{i,k} k^r P\{X_1 - X_2 = k \mid X_2 = i\}P\{X_2 = i\}$

(1)  $\quad = \sum_{i,k} k^r P_i^{(1)} P_{i+k}^{(2)} + \sum_{i,k} k^r P_{i+k}^{(1)} P_i^{(2)},$

where the summations are over 1, 2, 3, ...
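The double sum in (1) can be evaluated directly for any pair of finite (or truncated) probability vectors. The following is a minimal sketch; the function name `abs_diff_moment` and the convention that the support starts at 0 (so that the Poisson and Binomial cases are covered) are assumptions made for illustration.

```python
def abs_diff_moment(p1, p2, r):
    """r-th moment of |X1 - X2| for independent X1, X2 given as pmf lists.

    p1[i] = P{X1 = i}, p2[i] = P{X2 = i}.  The support is assumed to start
    at 0 and to be finite; an infinite support must be truncated first.
    """
    n = max(len(p1), len(p2))
    a = list(p1) + [0.0] * (n - len(p1))
    b = list(p2) + [0.0] * (n - len(p2))
    total = 0.0
    for i in range(n):
        for k in range(1, n - i):
            # First sum of (1): X2 exceeds X1 by k.
            # Second sum of (1): X1 exceeds X2 by k.
            total += (k ** r) * (a[i] * b[i + k] + a[i + k] * b[i])
    return total
```

For two independent fair coins, `abs_diff_moment([0.5, 0.5], [0.5, 0.5], 1)` gives the probability that they differ.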


Received February 24, 1959; revised June 27, 1959. 

1 This research was supported by the United States Air Force Contract No. AF49(638)-43 monitored by the Air Force Office of Scientific Research of the Air Research and Development Command.







MOMENTS OF DISCRETE DISTRIBUTIONS 79 


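3. Some applications of equation (1).
(a) Moments of the absolute difference for two independent Poisson random variables. Let

$P_i^{(1)} = e^{-\lambda_1}\lambda_1^i/i!, \qquad P_i^{(2)} = e^{-\lambda_2}\lambda_2^i/i!.$

Then, from (1),

(2)  $\Delta_r = e^{-\lambda_1-\lambda_2}\left\{\sum_{i=0}^{\infty}\sum_{k=0}^{\infty}\frac{k^r\lambda_1^i\lambda_2^{i+k}}{i!\,(i+k)!} + \sum_{i=0}^{\infty}\sum_{k=0}^{\infty}\frac{k^r\lambda_1^{i+k}\lambda_2^i}{i!\,(i+k)!}\right\} = e^{-\lambda_1-\lambda_2}(A_r + B_r)$, say.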


(3) 


where we denote the term in curly brackets by $A_0$. It is apparent that $A_0$ can be written in the form

(4)  $A_0 = \sum_{i=0}^{\infty}\frac{(\lambda_1\lambda_2)^i}{(i!)^2}\,{}_1F_1(1; i+1; \lambda_2),$


where ${}_1F_1(a; \gamma; z)$ is a confluent hypergeometric function [2]. Operating on (4) by $\lambda_2\,\partial/\partial\lambda_2$ yields


(5) A, = (.. ~ *) Ao + ve’’*Fi(¥; 3; —4-v) + = ev Fh; 1; —4/2). 
2 2 


Successive application s times of the operator $\lambda_2\,\partial/\partial\lambda_2$ leads to the recursion formula


(6) Aw = > (‘)( + (—1)™ ") A, + (~1)" = e’Y".Fi(4; 1; —4V/v). 
2 


‘=o Az 


$A_0$ can be calculated from formula (4), using the tables given by Nath [4] to get the values of the confluent hypergeometric functions involved therein, and then $A_r$ can be calculated by the repeated application of (5) and (6). Calculation of $B_r$ follows along similar lines. Equation (2) can then be employed to calculate $\Delta_r$.

For the particular case when $\lambda_1 = \lambda_2 = \lambda$, i.e. when $X_1$ and $X_2$ are Poisson variates with the same mean, $\Delta_1$ reduces to


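On using the facts that the Bessel function of the first kind $J_\nu(z)$ [2] and the modified Bessel function of the first kind $I_n(z)$ [5] are given by

$J_\nu(z) = \sum_{i=0}^{\infty}\frac{(-1)^i (z/2)^{2i+\nu}}{i!\,(\nu+i)!} = \frac{(z/2)^\nu e^{-\sqrt{-1}\,z}}{\Gamma(\nu+1)}\,{}_1F_1\!\left(\tfrac12+\nu;\ 1+2\nu;\ 2\sqrt{-1}\,z\right),$

$I_n(z) = \sum_{i=0}^{\infty}\frac{(z/2)^{2i+n}}{i!\,(n+i)!},$

we obtain

(8)  $\Delta_1 = 2\lambda e^{-2\lambda}\{I_0(2\lambda) + I_1(2\lambda)\},$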


which agrees with formula (2.18) of Ramasubban [5]. 
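Formula (8) can be checked numerically against a direct double summation. The sketch below is illustrative only: the helper names, the truncation points and the use of the power series for $I_n$ are choices made here, not the paper's computational method (which relies on tables).

```python
import math

def bessel_i(n, z, terms=60):
    """Modified Bessel function I_n(z), integer n >= 0, from its power series."""
    return sum((z / 2.0) ** (2 * i + n) / (math.factorial(i) * math.factorial(n + i))
               for i in range(terms))

def poisson_pmf(lam, size):
    """First `size` Poisson probabilities, built from P_{i+1}/P_i = lam/(i+1)."""
    p = [math.exp(-lam)]
    for i in range(size - 1):
        p.append(p[-1] * lam / (i + 1))
    return p

def poisson_mean_abs_diff_direct(lam, size=80):
    """E|X1 - X2| for two independent Poisson(lam) variables, by brute force."""
    p = poisson_pmf(lam, size)
    return sum(abs(i - j) * p[i] * p[j] for i in range(size) for j in range(size))

def poisson_mean_abs_diff_formula(lam):
    """Right-hand side of (8)."""
    return 2.0 * lam * math.exp(-2.0 * lam) * (bessel_i(0, 2.0 * lam) + bessel_i(1, 2.0 * lam))
```

The truncation at 80 terms is adequate only for moderate $\lambda$; larger means would need a longer probability vector.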

(b) Moments of the absolute difference for two Pascal or two Binomial random variables. First, let $X_1$ and $X_2$ be two independent Pascal random variables. Write

$P_i^{(1)} = q_1^{-k_1}\binom{k_1+i-1}{i}\left(\frac{p_1}{q_1}\right)^i, \qquad P_i^{(2)} = q_2^{-k_2}\binom{k_2+i-1}{i}\left(\frac{p_2}{q_2}\right)^i.$

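From (1) (by changing $k$ to $j$), we have

(9)  $\Delta_r = q_1^{-k_1}q_2^{-k_2}\left\{\sum_{i}\sum_{j} j^r\binom{k_1+i-1}{i}\binom{k_2+i+j-1}{i+j}\left(\frac{p_1}{q_1}\right)^i\left(\frac{p_2}{q_2}\right)^{i+j} + \sum_{i}\sum_{j} j^r\binom{k_1+i+j-1}{i+j}\binom{k_2+i-1}{i}\left(\frac{p_1}{q_1}\right)^{i+j}\left(\frac{p_2}{q_2}\right)^i\right\}$

$\quad = q_1^{-k_1}q_2^{-k_2}(A_r + B_r)$, say.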


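In order to simplify $A_r$, we now write $p_1p_2/q_1q_2 = \nu_1$ and $p_2/q_2 = \nu_2$. Consequently, we can write

(10)  $A_r = \left(\nu_2\frac{\partial}{\partial\nu_2}\right)^r\left\{\sum_{i=0}^{\infty}\binom{k_1+i-1}{i}\binom{k_2+i-1}{i}\nu_1^i\,{}_2F_1(k_2+i,\ 1;\ 1+i;\ \nu_2)\right\}$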


where ${}_2F_1(a, b; c; z)$ is a hypergeometric function [2]. Denote the term in the curly bracket by $A_0$. A recurrence formula for calculating $A_{s+1}$ is


_ : 8 0 mi ke ky vy 
tn £()44 (od) [rts Aa} 


a\' 1 
+ my ky oF (ke, ki + 131; »1) (2) (—1 ). 
Ove vo — Vi 


Since the value of $A_0$ can be computed to any degree of accuracy from its formula, the quantities we need to know before we can use (11) are


(» 2) | + Ss | and ( *) Lo for all s. 
Ve 1 — vw Yi — Ve 0 a 


Since the derivatives with respect to $\nu_2$ of the functions involved are relatively simple, we will give here a method for expressing $(\nu_2\,\partial/\partial\nu_2)^s f$ in terms of $(\partial/\partial\nu_2)^i f$.


(11) 







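First, we observe that

(12)  $\left(\nu_2\frac{\partial}{\partial\nu_2}\right)\left(\nu_2^i\frac{\partial^i f}{\partial\nu_2^i}\right) = i\,\nu_2^i\frac{\partial^i f}{\partial\nu_2^i} + \nu_2^{i+1}\frac{\partial^{i+1} f}{\partial\nu_2^{i+1}}.$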

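It follows that $(\nu_2\,\partial/\partial\nu_2)^s f$ has the form

(13)  $\sum_{i=1}^{s} a_i^{(s)}\nu_2^i\left(\frac{\partial}{\partial\nu_2}\right)^i f,$

where $a_i^{(s)}$ are constants, not involving $f$. To evaluate $a_i^{(s)}$, we note that

(14)  $\left(\nu_2\frac{\partial}{\partial\nu_2}\right)^{s+1} f = \sum_{i}\left(i\,a_i^{(s)} + a_{i-1}^{(s)}\right)\nu_2^i\left(\frac{\partial}{\partial\nu_2}\right)^i f = \sum_{i} a_i^{(s+1)}\nu_2^i\left(\frac{\partial}{\partial\nu_2}\right)^i f,$

with the convention that $a_i^{(s)} = 0$ for $i > s$ (and for $i < 1$). This yields us the set of recurrence formulae

(15)  $a_i^{(s+1)} = i\,a_i^{(s)} + a_{i-1}^{(s)}.$

It is evident that $a_1^{(1)} = 1$. Hence (15) can be used successively to obtain the various values of $a_i^{(s)}$ and then (13) used to obtain $(\nu_2\,\partial/\partial\nu_2)^s f$.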
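Recurrence (15) is easy to tabulate. The sketch below builds the coefficients $a_i^{(s)}$ and checks form (13) on the test functions $f(\nu) = \nu^n$, for which $(\nu\,d/d\nu)^s \nu^n = n^s\nu^n$ while $\nu^i (d/d\nu)^i \nu^n = n(n-1)\cdots(n-i+1)\,\nu^n$. The function names are illustrative; the coefficients produced this way coincide with the Stirling numbers of the second kind.

```python
def euler_operator_coefficients(s_max):
    """Return a[s][i] = a_i^{(s)} for 1 <= i <= s <= s_max, using recurrence (15)."""
    a = {1: {1: 1}}
    for s in range(1, s_max):
        a[s + 1] = {}
        for i in range(1, s + 2):
            # a_i^{(s+1)} = i * a_i^{(s)} + a_{i-1}^{(s)}; absent entries count as 0.
            a[s + 1][i] = i * a[s].get(i, 0) + a[s].get(i - 1, 0)
    return a

def falling_factorial(n, i):
    """n (n-1) ... (n-i+1), the factor produced by differentiating x^n i times."""
    out = 1
    for j in range(i):
        out *= n - j
    return out

coeffs = euler_operator_coefficients(5)
```

Here `coeffs[3]` holds the three coefficients needed to express the third power of the operator.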
The calculation of $B_r$ follows along similar lines. $\Delta_r$ can then be calculated from (9). For the particular case when $X_1$ and $X_2$ are Pascal random variables with the same parameters, i.e.

$P_i^{(1)} = P_i^{(2)} = q^{-k}\binom{k+i-1}{i}\left(\frac{p}{q}\right)^i,$

$\Delta_1$ reduces to


te) —~ 
hee (= 
1 q q 


{iF (+ lk+ 15%) + (e+ Bari (k+2k+152:%)}. 
¢ q ¢ 


Upon using the relation (see [2], eqn. (36), p. 113) that

${}_2F_1(a, b; a-b+1; z) = (1+\sqrt{z})^{-2a}\,{}_2F_1\!\left(a,\ a-b+\tfrac12;\ 2a-2b+1;\ 4\sqrt{z}\,(1+\sqrt{z})^{-2}\right),$

we obtain

(18)  $\Delta_1 = 2kpq\,{}_2F_1(k+1,\ \tfrac12;\ 2;\ -4pq)$


which agrees with formula (2.12) of Ramasubban [5]. 
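Formula (18) can be compared with a brute-force evaluation of $E|X_1 - X_2|$. The sketch below assumes the Pascal parametrization $P_i = q^{-k}\binom{k+i-1}{i}(p/q)^i$ with $q = 1 + p$, and evaluates ${}_2F_1$ by its Gauss series, which is usable only when $4pq < 1$. All function names and truncation points are illustrative.

```python
def hyp2f1(a, b, c, z, terms=400):
    """Gauss series for 2F1(a, b; c; z); adequate only for |z| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total

def pascal_pmf(k, p, size):
    """P_i = q^{-k} C(k+i-1, i) (p/q)^i with q = 1 + p, via the ratio P_{i+1}/P_i."""
    q = 1.0 + p
    probs = [q ** (-k)]
    for i in range(size - 1):
        probs.append(probs[-1] * (k + i) / (i + 1) * (p / q))
    return probs

def pascal_mean_abs_diff_direct(k, p, size=300):
    """E|X1 - X2| for two independent Pascal(k, p) variables, by brute force."""
    pr = pascal_pmf(k, p, size)
    return sum(abs(i - j) * pr[i] * pr[j] for i in range(size) for j in range(size))

def pascal_mean_abs_diff_formula(k, p):
    """Right-hand side of (18)."""
    q = 1.0 + p
    return 2.0 * k * p * q * hyp2f1(k + 1, 0.5, 2.0, -4.0 * p * q)
```

For $k = 1$ the hypergeometric factor collapses to $(1 + 4pq)^{-1/2} = 1/(1 + 2p)$, so (18) gives $2pq/(1+2p)$, which offers a simple closed-form check.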
Suppose now that $X_1$ and $X_2$ are Binomial random variables with

$P_i^{(1)} = \binom{n_1}{i}p_1^i q_1^{n_1-i}$ and $P_i^{(2)} = \binom{n_2}{i}p_2^i q_2^{n_2-i}.$


It is apparent that the formulae for this case can be obtained from those given 


(16) 


(17) 





82 8. K. KATTI 


for the Pascal case by changing the $k$'s to $(-n)$'s, $p$'s to $(-p)$'s and the quantities $\binom{k+i-1}{i}$ to $(-1)^i\binom{n}{i}$. Hence no separate discussion need be given here for this case.


4. A method to evaluate the moments of an absolute deviation. Let $X$ be an arbitrary positive integral valued random variable and let $P_i$ denote the probability of obtaining $X = i$. Then the moment generating function (m.g.f.) of $|X - m|$ where $m$ is a fixed constant is given by

(19)  $m(t) = E e^{t|X-m|} = \sum_{i=[m]+1}^{\infty} e^{t(i-m)}P_i + \sum_{i=0}^{[m]} e^{t(m-i)}P_i.$

Here, $[m]$ is the largest integer less than or equal to $m$ and the second term is considered zero when $m < 0$. Since, for $m < 0$, moments of $|X - m|$ are the same as those of $(X - m)$, we will consider only the case when $m > 0$. Equation (19) can now be simplified to

(20)  $m(t) = e^{-mt}\sum_{i=[m]+1}^{\infty} e^{it}P_i - e^{mt}\sum_{i=[m]+1}^{\infty} e^{-it}P_i + e^{mt}\sum_{i=0}^{\infty} e^{-it}P_i = \psi(t) - \psi(-t) + e^{mt}M(-t)$, say,

where $M(t)$ is the m.g.f. of $X$ and wherein $\psi(t)$, the first term in (20), may be referred to as the incomplete m.g.f. of $X - m$. Since the even moments of $|X - m|$ are the same as those of $X - m$ and hence obtainable by using the regular statistical techniques, we will consider only the moments of odd order, say $\delta_{2r+1}$. On differentiating (20) $(2r + 1)$ times and setting $t = 0$, we have

(21)  $\delta_{2r+1} = 2\psi^{(2r+1)}(0) - E(X - m)^{2r+1}.$

As remarked above, calculation of the second term poses no new problem. Our task therefore reduces to that of obtaining $\psi^{(2r+1)}(0)$.
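Relation (21) can be illustrated numerically. Since $\psi(t) = \sum_{i>[m]} e^{t(i-m)}P_i$, its derivatives at 0 are the one-sided sums $\sum_{i>[m]}(i-m)^n P_i$. The sketch below uses a truncated Poisson distribution as an assumed example and compares (21) with a direct evaluation of $E|X-m|^{2r+1}$; the function names are illustrative.

```python
import math

def poisson_pmf(lam, size):
    """First `size` Poisson probabilities, built from P_{i+1}/P_i = lam/(i+1)."""
    p = [math.exp(-lam)]
    for i in range(size - 1):
        p.append(p[-1] * lam / (i + 1))
    return p

def psi_derivative_at_zero(pmf, m, order):
    """psi^{(order)}(0): sum over i > [m] of (i - m)^order * P_i."""
    start = math.floor(m) + 1
    return sum((i - m) ** order * pmf[i] for i in range(start, len(pmf)))

def odd_abs_deviation_moment(pmf, m, r):
    """delta_{2r+1} from (21): 2 psi^{(2r+1)}(0) - E(X - m)^{2r+1}."""
    n = 2 * r + 1
    central = sum((i - m) ** n * pmf[i] for i in range(len(pmf)))
    return 2.0 * psi_derivative_at_zero(pmf, m, n) - central

def abs_deviation_moment_direct(pmf, m, n):
    """E|X - m|^n computed term by term, for comparison."""
    return sum(abs(i - m) ** n * pmf[i] for i in range(len(pmf)))
```

With $m$ equal to the Poisson mean $\lambda$ and $r = 0$, the result can also be compared with the known closed form $2e^{-\lambda}\lambda^{[\lambda]+1}/[\lambda]!$ for the mean deviation.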


5. To obtain the value of $\psi^{(2r+1)}(0)$ for some particular cases.
(a) The Poisson distribution: Let $P_i = e^{-\lambda}\lambda^i/i!$. Then, from the definition of $\psi(t)$,

(22)  $\psi(t) = \sum_{i=[m]+1}^{\infty} e^{t(i-m)}\frac{e^{-\lambda}\lambda^i}{i!},$

which can be written in the form $KG(0, t)$, where

(23)  $K = \frac{e^{-\lambda}\lambda^{[m]+1}}{([m]+1)!},$

(24)  $G(0, t) = \exp\{t([m] - m + 1)\}\,{}_1F_1(1;\ [m]+2;\ \lambda e^t).$

In order to obtain a convenient method to evaluate $\psi^{(2r+1)}(0)$, let us define

(25)  $G(a, t) = \exp\{t([m] - m + a + 1)\}\,{}_1F_1(a+1;\ a+[m]+2;\ \lambda e^t).$

Differentiation of (25) yields

(26)  $G^{(1)}(a, t) = ([m] - m + a + 1)\,G(a, t) + \frac{\lambda(a+1)}{a+[m]+2}\,G(a+1, t).$

Successive differentiation of (26) s times at $t = 0$ gives the recurrence formula

(27)  $G^{(s+1)}(a, 0) = ([m] - m + a + 1)\,G^{(s)}(a, 0) + \frac{\lambda(a+1)}{a+[m]+2}\,G^{(s)}(a+1, 0).$

Since $G(a, 0) = {}_1F_1(a+1;\ a+[m]+2;\ \lambda)$ is the confluent hypergeometric function, we can obtain $G(a, 0)$ for $a = 0, 1, \ldots, 2r+1$ by referring to the tables of Nath [4]. $G^{(2r+1)}(0, 0)$ can then be calculated by the repeated application of (27). Calculation of $\psi^{(2r+1)}(0)$ immediately follows, since as can be easily seen, $\psi^{(2r+1)}(0) = KG^{(2r+1)}(0, 0)$.
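The recurrence (27) can be sketched as follows, with the tabulated values of ${}_1F_1$ replaced by a plain power series (an assumption made here for self-containment; the paper uses the tables of Nath [4]). The result is compared with a direct summation of $\sum_{i>[m]}(i-m)^n P_i$. Function names and truncation lengths are illustrative.

```python
import math
from functools import lru_cache

def hyp1f1(a, c, z, terms=200):
    """Power series for the confluent hypergeometric function 1F1(a; c; z)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) / ((c + n) * (n + 1)) * z
    return total

def psi_derivative_via_recurrence(lam, m, order):
    """psi^{(order)}(0) = K * G^{(order)}(0, 0), with G^{(s)}(a, 0) from (27)."""
    fm = math.floor(m)

    @lru_cache(maxsize=None)
    def g(s, a):
        if s == 0:
            # G(a, 0) = 1F1(a + 1; a + [m] + 2; lam)
            return hyp1f1(a + 1, a + fm + 2, lam)
        return ((fm - m + a + 1) * g(s - 1, a)
                + lam * (a + 1) / (a + fm + 2) * g(s - 1, a + 1))

    k_const = math.exp(-lam) * lam ** (fm + 1) / math.factorial(fm + 1)
    return k_const * g(order, 0)

def psi_derivative_direct(lam, m, order, size=120):
    """Direct evaluation of sum_{i > [m]} (i - m)^order * P_i for the Poisson case."""
    fm = math.floor(m)
    p = math.exp(-lam)
    total = 0.0
    for i in range(size):
        if i > fm:
            total += (i - m) ** order * p
        p *= lam / (i + 1)
    return total
```

Memoizing `g` keeps the work proportional to the number of distinct pairs `(s, a)`, which mirrors building up a triangular table by hand.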
(b) The Pascal, the Binomial and the Hypergeometric distributions: For the Pascal distribution,

$P_i = q^{-k}\binom{k+i-1}{i}\left(\frac{p}{q}\right)^i.$

By proceeding along lines similar to those in (a), we can write $\psi(t)$ in the form $YH(0, t)$ where

$Y = q^{-k}\binom{k+[m]}{[m]+1}\left(\frac{p}{q}\right)^{[m]+1}$

and

(29)  $H(a, t) = \exp\{t([m] + 1 - m + a)\}\,{}_2F_1\!\left(k+[m]+a+1,\ 1+a;\ [m]+a+2;\ \frac{p}{q}e^t\right).$


For the Hypergeometric distribution

(30)  $P_i = \binom{Np}{i}\binom{Nq}{n-i}\Big/\binom{N}{n}$

and

$\psi(t) = ZF(0, t),$

where

$Z = \binom{Np}{[m]+1}\binom{Nq}{n-[m]-1}\Big/\binom{N}{n},$

$F(a, t) = \exp\{t([m] - m + 1 + a)\}\,{}_3F_2(-n+[m]+1+a,\ -Np+[m]+1+a,\ 1+a;\ [m]+2+a,\ Nq-n+[m]+2+a;\ e^t),$

and ${}_3F_2(\alpha, \beta, \gamma; \delta, \epsilon; z)$ is a generalized hypergeometric function (cf. [2]). The recurrence formulae for obtaining $H^{(2r+1)}(0, 0)$ and $F^{(2r+1)}(0, 0)$ are

(34)  $H^{(s+1)}(a, 0) = (a - m + [m] + 1)\,H^{(s)}(a, 0) + \frac{p}{q}\,\frac{(k+1+[m]+a)(1+a)}{a+[m]+2}\,H^{(s)}(a+1, 0)$

and

(35)  $F^{(s+1)}(a, 0) = \frac{(-n+[m]+1+a)(-Np+[m]+1+a)(1+a)}{([m]+2+a)(Nq-n+[m]+2+a)}\,F^{(s)}(a+1, 0) + ([m]+1+a-m)\,F^{(s)}(a, 0)$

respectively. Calculation of $\psi^{(2r+1)}(0)$ for the two cases follows from its relationship with $H^{(2r+1)}(0, 0)$ and $F^{(2r+1)}(0, 0)$ which can be computed by the successive application of (34) and (35).

The formulae for evaluating $\psi^{(2r+1)}(0)$ in the case of the Binomial distribution can be obtained from those in the case of the Pascal distribution by making the changes suggested in 3(b).

(c) The Logarithmic distribution: For the Logarithmic distribution, P; = 
ar‘/i and —a log (1 — r) = 1. Hence 


2 dé 
(36) ot) =e) = DX. 
i—(m)+1 1 
In order to obtain a convenient method to compute ¥‘”*” (0), we first observe 
that 
(37) oe” (t)(1 ik re’) ‘at ar'™ +1) tim] +1) 


This leads to the recurrence relation 
e—l 
(38) 60) = Ear im) + 1)" +E (1) 90), 
a: ¢ im \t 


After having calculated $\phi^{(s)}(0)$ for $s = 1, 2, \ldots, 2r+1$ by using (38), $\psi^{(2r+1)}(0)$ can be calculated by using the recurrence formula

$\psi^{(2r+1)}(0) = \sum_{i=0}^{2r+1}\binom{2r+1}{i}\phi^{(i)}(0)\,(-m)^{2r+1-i},$

which can be easily derived from (36).


6. Conclusion. The general expressions given in sections (2) and (4) can be 
employed to obtain methods for finding the moments of the absolute difference 
and absolute deviation for some well known distributions. It was shown in section 
(3) that the formulae for $\Delta_r$ involve, as particular cases, the results obtained by
Ramasubban [5]. It can be easily shown that the formulae in section (5) also 
lead to his results when we set r = 0. This has been left out of the discussion for 
brevity. 







In conclusion, the author would like to thank Mr. J. N. K. Rao for bringing 
this interesting problem to his attention, Dr. John Gurland for his keen interest 


and helpful guidance during the development of this work and a referee for 
useful comments. 


REFERENCES 


[1] Edwin L. Crow, "The mean deviation of the Poisson distribution", Biometrika, Vol. 45 (1958), pp. 556-559.

[2] A. Erdélyi, Higher Transcendental Functions, Vol. 1, McGraw-Hill Book Co., New York, 1953.

[3] N. L. Johnson, "A note on the mean deviation of the Binomial distribution", Biometrika, Vol. 44 (1957), pp. 532-533.

[4] Pran Nath, "Confluent hypergeometric function", Sankhyā, Vol. 11 (1951), pp. 153-166.

[5] T. A. Ramasubban, "The mean difference and the mean deviation of some discontinuous distributions", Biometrika, Vol. 45 (1958), pp. 549-556.





PRIORITY QUEUES¹


By Rupert G. Miller, Jr.
Stanford University


1. Introduction and summary. In a priority queue different types of items (individuals or elements) arrive at a service mechanism and each item has a relative priority for order of service. Let there be K classes of items, 1, 2, ..., K. If the service mechanism is to select an item for service, a type i item will be selected in preference to a type j item for i < j even if the type j item arrived before the type i item, and within each class the "first come, first served" policy determines the order of service. When a type j item is in service and a type i item arrives (i < j), there are two primary disciplines for handling the priority demand. The "head-of-the-line" discipline allows the type j item to complete service but places the type i item ahead of any other lower priority items. The "preemptive" discipline withdraws the type j item from service and replaces it by the type i item. Under the preemptive scheme the only time at which a type j item (1 < j) can be in service is when there are no items of types 1, ..., j - 1 in the queue. When a lower priority item which has been preempted returns to service, the preemptive discipline must distinguish two cases. The "preemptive resume" policy allows the preempted item to resume service at the point at which it was preempted so that its service time upon reentry has been reduced by the amount of time the item has already spent in service. The "preemptive repeat" policy requires the preempted item to commence service again at the beginning. A priority queue with an indifferent server is of course a special case of the preemptive resume discipline.
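The selection rule of the head-of-the-line discipline (lowest class index first, "first come, first served" within a class) can be sketched with a binary heap keyed on the pair (class, arrival time). This is only an illustration of the ordering rule for items that are already waiting; the function name and the tuple layout are assumptions, and no service times or preemption are modelled.

```python
import heapq

def head_of_line_order(arrivals):
    """Service order for waiting items under the head-of-the-line rule.

    arrivals: list of (arrival_time, priority_class, label); class 1 is highest.
    """
    # Key on (class, arrival time): lower class first, earlier arrival breaks ties.
    heap = [(cls, t, label) for (t, cls, label) in arrivals]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, label = heapq.heappop(heap)
        order.append(label)
    return order

waiting = [(0.0, 2, "a"), (1.0, 1, "b"), (2.0, 2, "c"), (3.0, 1, "d")]
service_order = head_of_line_order(waiting)
```

In this example both class 1 items are served before either class 2 item, even though item "a" arrived first.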

In the special case K = 2 the type 1 items will be referred to as priority items 
and the type 2 items as non-priority items. 

It will be assumed throughout this paper that the input process for type i items, i = 1, ..., K, is Poisson with arrival rate $\lambda_i$ and the input processes operate independently. The service time distribution for a type i item (in isolation) will be denoted by $F_{S_i}$ and unless explicitly stated to the contrary will be assumed to be general subject only to the restrictions $F_{S_i}(0+) = 0$ and $E(S_i) < \infty$. Let $\rho_i = \lambda_i E(S_i)$, and let $\mathcal{S}_i$ be the Laplace-Stieltjes transform of $F_{S_i}$. The service mechanism consists of a single channel or server.

A. Cobham [1], [2] introduced the head-of-the-line priority queue and derived 
equilibrium expected waiting times. Subsequent contributions have been made 
by Holley [3], Kesten and Runnenburg [4], [5], and Morse [6]. The first published 
results for the preemptive discipline were by H. White and L. S. Christie [7],
and additional results have been presented by Stephan [8]. Koenigsberg [9] has 


Received October 31, 1958; revised June 2, 1959. 


1 Technical Report No. 6, prepared under contract Nonr-225(28) for the Office of Naval
Research, Reference No. NR-047-019. 




generalized the priority model to a continuous number of priority types with 
application to machine breakdown problems. 

Under various assumptions in this paper the following quantities have either 
been obtained explicitly or characterized as the unique (subject to regularity
conditions) solution to a functional equation: the generating function for the 
stationary probabilities on the number of priority and non-priority items (K = 2) 
in the queue, the Laplace-Stieltjes transforms of the waiting time distributions, 
the Laplace-Stieltjes transform of the distribution of a busy period, and the 
generating function for the probabilities on the number of items serviced during 
a busy period. For most of the distributions mentioned the first two moments 
are computed. 


2. Stationary distributions of the number of items in the queue (K = 2). For 
completeness the results of White and Christie [7] and Morse [6] are summarized 
briefly. 

For Poisson arrivals and exponential service the queue process is a continuous 
time parameter Markov process. If P is a stationary distribution of the queue 
process, it must be a solution to the forward steady state equations which, 
symbolically, can be represented as PA = 0 where A is the infinitesimal matrix 
of the process. If the system of equations PA = 0 has a unique solution (subject 
to the condition it be a probability distribution) and a stationary distribution is 
assumed to exist, then algebraic manipulation of the equations PA = 0 will 
yield a characterization of the stationary distribution or its generating function. 

For the preemptive priority queue with K = 2 White and Christie employ this method to obtain the generating function and thereby the first and second moments. Let $\mu_1$ and $\mu_2$ denote the service rate parameters for the priority and non-priority items, respectively. Justification of a non-priority Poisson service process with the parameter $\mu_2$ from assumptions on the non-priority service process in isolation is discussed in detail in [7] with regard to the resume and repeat disciplines and the indifferent server queue.

If $\rho_1 + \rho_2 > 1$, the queue will become saturated with items and no stationary distribution will exist so the equilibrium condition $\rho_1 + \rho_2 < 1$ is assumed.

Let $P_{nm}$ be the stationary probability that there are n priority and m non-priority items in the queue and $P(s, t) = \sum_{n,m=0}^{\infty} P_{nm}s^n t^m$. That $P_{nm}$ is uniquely defined is evident from inspection of the equations.

9 . eet 1031 (1 — pr — pr)ue(f* — 1) 

G1) IO eel = = MD ll = Fe 

where 

(22) a(t) = A+ Ml —-t)+m—-V mt Ae( 1 — t) + my)? — Aa 
1 


The moments of the number of priority items in the queue are the same as those for priority items in isolation; e.g.,


1 


9.2 
(2.3) E(n) = resis E(n’) = 291 + | pi 
“2 


a= a * =) 





88 RUPERT G. MILLER, JR. 


The moments of the number of non-priority items in the queue can be evaluated 
from (2.1). 


wm) = [—2&—][#(-25) +1]; 


(24) E(m’) = — 2oi(s/i)" + a st E + =/ 2 )] 
(1 — pi)*(1 — pr — 2) = (1 — pr — rn)? mw: \l — px 
4 ol — pr)’ + pi(As/mr)® + pr(1 — pr) (1 — pr + 2) (A2/m) 
(1 — pi)*(1 — pi — p2)* ; 
For a head-of-the-line priority queue with exponential service ($\mu_1$, $\mu_2$) Morse ([6], Ch. 9) has derived the generating function and first moments of the stationary probabilities through the same technique.





Het) wo i -F -a-a) 
Pl) = <a paul —-) + el 3) 
Pao{t) [Ar + A2(1 — t) + wallan(1— s*) — wed — £*)) 
[Ar(L — 8) + Ag(L — f) + pal[Ar(l — 8) + Ao(l — 2 +u(l — sD)’ 


(2.5) 
+ 


where 
Pxy(t) = 


t{l — pr — pallAr + ACL — t) — proe(t))[(ur — we) (Ar + AoC — t) + pe) 

(2.6) — Ar mil 
[Ar + Ag(L — &) + wmal[Ar wr 6 — (ur — we)t(Ar + Ao(1 — f) + we) 
+ po(ui(l — a(t)) — we)] 


(2.7) E(n) = —* E + nl]. 
2" Ay Me 


(2.8) E(m) = » + ——— [ at alin) 


M1 — A Ll — op — pe 


The previous technique is not applicable for a head-of-the-line priority queue 
with Poisson arrivals but non-exponential service since the number of items in 
the queue no longer has the Markov property. The Markov property can be 
restored by reducing the continuous time parameter process to discrete time. 
This technique was introduced by D. G. Kendall ([10], [11]) and has been
utilized by others (cf. [12], [13]). A discrete time Markov process is generated 
if the queue process is observed only at those points in time which are the 
termination points of a service period—priority or non-priority. The state of 
the queue is (n, m) where n is the number of priority and m the number of 
non-priority items in the queue (at the end of the service period). Since both 
priority and non-priority arrivals are Poisson the discrete time process has the 
Markov property. 

The behavior of the Markov chain “imbedded” in the continuous time process 







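is determined by the transition probability matrix which is expressible in terms of $p_{ij}$ = probability that i priority and j non-priority items arrive during a priority service period and $q_{ij}$ = probability that i priority and j non-priority items arrive during a non-priority service period.

$p_{ij} = \int_0^{\infty} e^{-\lambda_1 t}\frac{(\lambda_1 t)^i}{i!}\,e^{-\lambda_2 t}\frac{(\lambda_2 t)^j}{j!}\,dF_{S_1}(t),$

$P(s, t) = \sum_{i,j} p_{ij}s^i t^j = \mathcal{S}_1(\lambda_1(1 - s) + \lambda_2(1 - t)),$

$q_{ij} = \int_0^{\infty} e^{-\lambda_1 t}\frac{(\lambda_1 t)^i}{i!}\,e^{-\lambda_2 t}\frac{(\lambda_2 t)^j}{j!}\,dF_{S_2}(t),$

$Q(s, t) = \sum_{i,j} q_{ij}s^i t^j = \mathcal{S}_2(\lambda_1(1 - s) + \lambda_2(1 - t)).$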
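For a concrete feel for these quantities, consider the special case (an assumption made here, not required by the paper) in which the service time is exponential with rate $\mu$. The integral for $p_{ij}$ then has the closed form $\mu\binom{i+j}{i}\lambda_1^i\lambda_2^j/(\lambda_1+\lambda_2+\mu)^{i+j+1}$, and the generating function should equal the transform $\mu/(\mu + x)$ evaluated at $x = \lambda_1(1-s) + \lambda_2(1-t)$. The sketch below checks this numerically; the function names are illustrative.

```python
import math

def p_ij_exponential(i, j, lam1, lam2, mu):
    """p_ij when the service time is exponential with rate mu (assumed special case)."""
    total = lam1 + lam2 + mu
    return (mu * math.comb(i + j, i) * lam1 ** i * lam2 ** j) / total ** (i + j + 1)

def generating_function(s, t, lam1, lam2, mu, size=60):
    """Truncated double sum of p_ij * s^i * t^j."""
    return sum(p_ij_exponential(i, j, lam1, lam2, mu) * s ** i * t ** j
               for i in range(size) for j in range(size))

def service_lst(x, mu):
    """Laplace-Stieltjes transform of the exponential(mu) distribution."""
    return mu / (mu + x)
```

Setting $s = t = 1$ gives a quick sanity check that the $p_{ij}$ sum to one.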


Let $P\{(n, m) \to (n', m')\}$ be the probability the queue moves from state (n, m) to state (n', m') in one transition. In terms of the $p_{ij}$, $q_{ij}$, the $P\{(n, m) \to (n', m')\}$ are

(1) $P\{(n, m) \to (n', m')\} = 0$ for $n' < n - 1$, $n \ge 1$, all $m$, $m'$,
(2) $P\{(n, m) \to (n', m')\} = 0$ for $m' < m$, $n \ge 1$, all $n'$,
(3) $P\{(n, m) \to (n - 1 + i, m + j)\} = p_{ij}$ for $i, j \ge 0$, $n \ge 1$, all $m$,
(4) $P\{(0, m) \to (n, m')\} = 0$ for $m' < m - 1$, all $n$,
(5) $P\{(0, m) \to (i, m - 1 + j)\} = q_{ij}$ for $i, j \ge 0$, $m > 0$,
(6) $P\{(0, 0) \to (i, j)\} = r_1 p_{ij} + r_2 q_{ij}$ for $i, j \ge 0$,

where $r_1 = \lambda_1/(\lambda_1 + \lambda_2)$ and $r_2 = \lambda_2/(\lambda_1 + \lambda_2)$. The transition probabilities under (6) have their special form because if the state is (0, 0) the queue is next observed at the end of the service period for the first arrival so the probability of the new state (n', m') depends on which type of item was first to arrive.
The state (0, 0) is ergodic if the equilibrium condition $\rho_1 + \rho_2 < 1$ is satisfied. Since the proof of this is analogous to those in [13] and [14] for different queues, it will be omitted. The ergodicity of the state (0, 0) guarantees the existence of the stationary distribution for the imbedded Markov chain.

The stationary probability of there being n priority, m non-priority items in the queue will be denoted by $\pi_{nm}$. By definition, the stationary distribution $\pi = \{\pi_{nm}\}$ must satisfy the system of equations

(2.10)  $\pi_{n'm'} = \sum_{n,m}\pi_{nm}P\{(n, m) \to (n', m')\}$,  all $n'$, $m'$.







From (2.10)

(2.11)  $\pi(s, t) = \sum_{n',m'}\sum_{n,m}\pi_{nm}P\{(n, m) \to (n', m')\}s^{n'}t^{m'}$

which, when simplified, gives the following expression for $\pi(s, t)$:

(2.12)  $\pi(s, t) = [\pi_{00}(r_1P(s, t) + r_2Q(s, t) - t^{-1}Q(s, t)) + \pi_0(t)(t^{-1}Q(s, t) - s^{-1}P(s, t))][1 - s^{-1}P(s, t)]^{-1},$

where $\pi_0(t) = \sum_{m=0}^{\infty}\pi_{0m}t^m$ is analogous to $P_{0\cdot}(t)$ of (2.5).

To determine $\pi(s, t)$ it is necessary to determine $\pi_0(t)$. This can be accomplished by imbedding a second Markov chain within the original imbedded
Markov chain. The second Markov chain is defined by taking cognizance of the 
state of the process only at those time points which are the termination points 
of a service period leaving 0 priority items in the queue. The state of the queue 
is (m), the number of non-priority items in the queue. This Markov chain is 
imbedded within the original chain since a trial for the second chain occurs at 
the end of a service period only if there are 0 priority items left in line whereas, 
previously, the termination of any service period constituted a trial. 

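Let $P\{m \to m'\}$ denote the probability of moving from state m to state m' in one transition. For $m > 0$, $j \ge 0$,

$P\{m \to m - 1 + j\} = P\{m \to m - 1 + j$, no priority arrivals in interim$\}$ $+\ P\{m \to m - 1 + j$, priority arrivals in interim$\}$

$\quad = \int_0^{\infty} e^{-\lambda_1 u}e^{-\lambda_2 u}\frac{(\lambda_2 u)^j}{j!}\,dF_{S_2}(u) + \sum_{n=1}^{\infty}\sum_{l=0}^{j}\int_0^{\infty} e^{-\lambda_1 u}\frac{(\lambda_1 u)^n}{n!}\,e^{-\lambda_2 u}\frac{(\lambda_2 u)^l}{l!}\,dF_{S_2}(u)\left[\int_0^{\infty} e^{-\lambda_2 v}\frac{(\lambda_2 v)^{j-l}}{(j-l)!}\,dF_B^{(n)}(v)\right].$

$F_B$ is the distribution of the busy period (see [15] or Section 4) for priority items in isolation and $F_B^{(n)}$ is its n-fold convolution. If

$P(t) = \sum_{j=0}^{\infty} P\{m \to m - 1 + j\}t^j$

for $m > 0$, it is readily verified that

(2.13)  $P(t) = \mathcal{S}_2(\lambda_1(1 - B(\lambda_2(1 - t))) + \lambda_2(1 - t))$

where $B$ is the Laplace-Stieltjes transform of $F_B$.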
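For $m = 0$, $j \ge 0$,

$P\{0 \to j\} = P\{0 \to j$, first arrival is priority$\}$ $+\ P\{0 \to j$, first arrival is non-priority$\}$

$\quad = r_1\int_0^{\infty} e^{-\lambda_2 u}\frac{(\lambda_2 u)^j}{j!}\,dF_B(u) + r_2\sum_{n=0}^{\infty}\sum_{l=0}^{j}\int_0^{\infty} e^{-\lambda_1 u}\frac{(\lambda_1 u)^n}{n!}\,e^{-\lambda_2 u}\frac{(\lambda_2 u)^l}{l!}\,dF_{S_2}(u)\left[\int_0^{\infty} e^{-\lambda_2 v}\frac{(\lambda_2 v)^{j-l}}{(j-l)!}\,dF_B^{(n)}(v)\right],$

where $F_B^{(0)}$ is the distribution which concentrates its total mass at 0. If

$Q(t) = \sum_{j=0}^{\infty} P\{0 \to j\}t^j,$

then

(2.14)  $Q(t) = r_1B(\lambda_2(1 - t)) + r_2\,\mathcal{S}_2(\lambda_1(1 - B(\lambda_2(1 - t))) + \lambda_2(1 - t)).$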


Let $\pi^*_m$ be the stationary probability of there being m non-priority items in the queue (for the second imbedded Markov chain). Algebraic manipulation of the system of equations

(2.15)  $\pi^*_{m'} = \sum_{m=0}^{\infty}\pi^*_m P\{m \to m'\}$,  all $m'$

yields the following expression for $\pi^*(t) = \sum_{m=0}^{\infty}\pi^*_m t^m$:

(2.16)  $\pi^*(t) = \pi^*_0[Q(t) - t^{-1}P(t)][1 - t^{-1}P(t)]^{-1}.$

$\pi^*_0$ determines the normalization for the distribution $\pi^*$. If the second Markov chain is to be viewed as imbedded within the first, the proper normalization is $\pi^*_0 = \pi_{00}$ which implies $\pi^*_m = \pi_{0m}$ for all m. Hence,

(2.17)  $\pi_0(t) = \pi_{00}[Q(t) - t^{-1}P(t)][1 - t^{-1}P(t)]^{-1}.$
In conjunction with (2.12), (2.17) yields

(2.18)  $\pi(s, t) = \pi_{00}[1 - s^{-1}P(s, t)]^{-1}\{r_1P(s, t) + r_2Q(s, t) - t^{-1}Q(s, t) + (t^{-1}Q(s, t) - s^{-1}P(s, t))[1 - t^{-1}P(t)]^{-1}[Q(t) - t^{-1}P(t)]\}.$

$\pi_{00}$ is determined by the restraint $\pi(1, 1) = 1$; $\pi_{00} = 1 - \rho_1 - \rho_2$.
The first moments of n and m (and also higher moments) can be calculated from (2.18).


Alm: ECS{) + 12 E(S:)\_ 


(2.19) E(n) = ni(o+ m2) + 2(1 — pr) 


E(m) = 12(p. + p2) 


(2.20) 4 dale E(Si) + rz E(S3)] [ent + uw) + all — a» — a) | 
2u1 (1 — pi — px)(1 — pu) 

From (2.18)—(2.20) it is apparent that (2.18) does not agree with (2.5) when 
the service times are exponentially distributed. As more complex queues are 
studied, it becomes clear that the stationary distributions for the imbedded 
Markov chain and general time t are identical only for the simpler queues. For
example, a similar discrepancy is noted in [13]. 







It might be hoped to duplicate this analysis for the preemptive priority queue. 
However, there does not exist a natural imbedding procedure for the preemptive 
queue. The only method of avoiding incorporation of an additional time quantity 
into the definition of the state would be to observe the process just at the termina- 
tion of service of a non-priority item. But this is a one-dimensional queue and is 
of no significance. 

For those one-dimensional queues in which the arrival distribution is general 
and the service exponential the natural imbedding considers those times at which 
a new arrival enters the queue. For a priority queue with general arrival distribu- 
tions this imbedding pattern leads to a non-Markov process unless the time to the 
last arrival of the other type item is incorporated into the definition of the state. 
The addition of this extra time variable prohibits any simple analysis. 


3. Waiting time distributions. The waiting time of an item is defined to be the 
length of time the item must wait in the queue before it is taken into service. The 
time an item spends in service is not included in the waiting time. For a priority 
queue with head-of-the-line discipline the time in service of an item is just the
length of its service period, but for the preemptive discipline the term “‘time in 
service” will mean the total time from the moment the item first enters service 
to the moment it completes service including those periods of time in which it is 
waiting for reentry into service after having been preempted. 

The equilibrium condition $1 - \rho_1 - \cdots - \rho_K > 0$ will be assumed throughout this section so that it is meaningful to discuss stationary distributions. The discussion for general time t also applies to the transient case.

The method introduced by D. G. Kendall for the single class queue can be 
applied to derive the Laplace-Stieltjes transform of the steady state waiting 
time distribution for a priority item in a head-of-the-line priority queue (K = 2). 
Suppose an item has just completed service. Since the queue is assumed to be 
operating in a state of equilibrium, with probability $r_1$ the item was a priority
item and with probability $r_2$ the item was non-priority. If the item was a priority
item, the number of priority items remaining in the queue must be the
number which arrived during its waiting time and service period. If the item
was non-priority, the number of priority items in the queue is just the number
which arrived during its service period. The probability there are $n$ priority
items remaining in the queue is $\sum_{m=0}^{\infty} \pi_{nm}$, so

$$\sum_{m=0}^{\infty} \pi_{nm} = r_1 \int_0^{\infty} e^{-\lambda_1 t}\, \frac{(\lambda_1 t)^n}{n!}\, d\,[F_{W_1} * F_{S_1}](t) + r_2 \int_0^{\infty} e^{-\lambda_1 t}\, \frac{(\lambda_1 t)^n}{n!}\, dF_{S_2}(t), \tag{3.1}$$

where $F_{W_1}$ is the waiting time distribution for a priority item and $F_{W_1} * F_{S_1}$ denotes
the convolution of $F_{W_1}$ and $F_{S_1}$. From (3.1)

$$W_1(s) = \frac{\hat{\pi}((\lambda_1 - s)/\lambda_1,\, 1) - r_2 S_2(s)}{r_1 S_1(s)}, \tag{3.2}$$





PRIORITY QUEUES 93 


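where $W_1(s)$ is the Laplace-Stieltjes transform of $F_{W_1}$. Substitution of the value
of $\hat{\pi}((\lambda_1 - s)/\lambda_1, 1)$ as given by (2.18) gives

$$W_1(s) = \frac{(1 - \rho_1 - \rho_2)\, s + \lambda_2 (1 - S_2(s))}{s - \lambda_1 (1 - S_1(s))}. \tag{3.3}$$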


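The moments of $W_1$ can be calculated from (3.3). In particular,

$$E(W_1) = \frac{\lambda_1 E(S_1^2) + \lambda_2 E(S_2^2)}{2(1 - \rho_1)}, \tag{3.4}$$

$$E(W_1^2) = \frac{\lambda_1 E(S_1^3) + \lambda_2 E(S_2^3)}{3(1 - \rho_1)} + \frac{\lambda_1 E(S_1^2)\,[\lambda_1 E(S_1^2) + \lambda_2 E(S_2^2)]}{2(1 - \rho_1)^2}. \tag{3.5}$$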
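As an illustration of (3.4) and (3.5), the two moments can be evaluated directly from the arrival rates and the first three service-time moments. The following is a minimal Python sketch; the function names and the tuple layout for the moments are assumptions made here, not notation from the paper. With no non-priority traffic the formulas should reduce to the single class values, which the last lines check for exponential service.

```python
def priority_wait_moments(lam1, lam2, s1, s2):
    """First two moments of the priority-item waiting time, per (3.4)-(3.5).

    s1 and s2 are (E[S], E[S^2], E[S^3]) for priority / non-priority service.
    """
    rho1 = lam1 * s1[0]
    rho2 = lam2 * s2[0]
    if rho1 + rho2 >= 1:
        raise ValueError("equilibrium requires rho1 + rho2 < 1")
    b = lam1 * s1[1] + lam2 * s2[1]      # sum of lambda_i E(S_i^2)
    c = lam1 * s1[2] + lam2 * s2[2]      # sum of lambda_i E(S_i^3)
    mean = b / (2 * (1 - rho1))
    second = c / (3 * (1 - rho1)) + lam1 * s1[1] * b / (2 * (1 - rho1) ** 2)
    return mean, second


def exp_moments(mu):
    """(E[S], E[S^2], E[S^3]) for an exponential service time with rate mu."""
    return (1 / mu, 2 / mu**2, 6 / mu**3)


# Single class check: lambda = 0.5, mu = 1, no non-priority arrivals.
mean, second = priority_wait_moments(0.5, 0.0, exp_moments(1.0), exp_moments(1.0))
print(mean, second)
```

For these inputs the single class exponential queue has mean wait $\rho/(\mu - \lambda)$ and second moment $2\rho/(\mu - \lambda)^2$, which is what the sketch is compared against.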

The transform of the non-priority waiting time distribution can be obtained
by adaptation of the results of L. Takács [15]. For a simple, single class queue
with Poisson arrivals ($\lambda$) and general service distribution $F_S$, Takács established
that the Laplace-Stieltjes transform $W(s; t)$ of the waiting time distribution at
time $t$ is given by

$$W(s; t) = e^{st - \lambda(1 - S(s))t} \left[ 1 - s \int_0^t e^{-su + \lambda(1 - S(s))u}\, F_W(0+; u)\, du \right], \tag{3.6}$$

where $S(s)$ is the Laplace-Stieltjes transform of $F_S$ and $F_W(0+; u)$ is the
probability the queue is empty at time $u$. The Laplace transform of $F_W(0+; u)$
is related to $B(s)$, the Laplace-Stieltjes transform of the busy period distribution, by

$$\int_0^{\infty} e^{-su}\, F_W(0+; u)\, du = \frac{1}{s + \lambda(1 - B(s))}. \tag{3.7}$$

The transform of the steady state waiting time distribution is

$$W(s) = \lim_{t \to \infty} W(s; t) = \frac{1 - \lambda E(S)}{1 - \lambda(1 - S(s))/s}. \tag{3.8}$$

The waiting time of a non-priority item is the sum of two waiting times, $W_2^*$
and $W_2^{**}$. $W_2^*$ is the time required to service all priority and non-priority items
already in the queue at the arrival of the non-priority item, and $W_2^{**}$ is the
time consumed in servicing all subsequent priority arrivals which precede the
entrance into service of the non-priority item. As far as the waiting time of the
non-priority item is concerned, the following queue discipline could be in effect
at its arrival. Service all priority and non-priority items in the queue ahead of
the non-priority item at its arrival. Any priority arrivals occurring during this
time interval are refused service until the items initially in the queue have
been serviced, even if this means servicing a non-priority item in preference to
a priority item. After the initial group has been serviced, commence service on
the by-passed priority items and continue service until the queue has been
emptied of priority items. At this moment the non-priority item whose waiting
time is in question may enter service. $W_2^{**}$ is defined to be the service time for
the by-passed priority items.







If $n$ priority items arrive during the $W_2^*$ units of time, the distribution of $W_2^{**}$
is the same as the distribution of $B_n$, where $B_n$ is the length of a busy period
(see [15] and Section 4) for a one-dimensional queue with only priority items
in which there are $n$ priority items initially. Thus,

$$P\{W_2 \le x\} = \int_0^x \sum_{n=0}^{\infty} e^{-\lambda_1 y}\, \frac{(\lambda_1 y)^n}{n!}\, P\{B_n \le x - y\}\, dP\{W_2^* \le y\}, \tag{3.9}$$

and

$$W_2(s) = W_2^*(s + \lambda_1(1 - B_1(s))). \tag{3.10}$$

$W_2^*$ is obtainable from (3.8) with the identifications $\lambda = \Lambda_2 = \lambda_1 + \lambda_2$ and
$S(s) = S_2^*(s) = r_1 S_1(s) + r_2 S_2(s)$. To a non-priority item arriving at the queue
the distinction between previously arrived priority and non-priority items is
immaterial. All are serviced ahead of the non-priority item and could just as well
be viewed as having the average service time distribution $S_2^*(s)$. Hence,

$$W_2(s) = \frac{1 - \rho_1 - \rho_2}{1 - \dfrac{\Lambda_2 \left(1 - S_2^*(s + \lambda_1(1 - B_1(s)))\right)}{s + \lambda_1(1 - B_1(s))}}. \tag{3.11}$$


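Moments of $W_2$ can be determined from (3.11) and the results for $B_1(s)$ in
[15] or the next section.

$$E(W_2) = \frac{\lambda_1 E(S_1^2) + \lambda_2 E(S_2^2)}{2(1 - \rho_1)(1 - \rho_1 - \rho_2)}, \tag{3.12}$$

$$E(W_2^2) = \frac{\lambda_1 E(S_1^3) + \lambda_2 E(S_2^3)}{3(1 - \rho_1)^2 (1 - \rho_1 - \rho_2)} + \frac{[\lambda_1 E(S_1^2) + \lambda_2 E(S_2^2)]^2}{2(1 - \rho_1)^2 (1 - \rho_1 - \rho_2)^2} + \frac{\lambda_1 E(S_1^2)\,[\lambda_1 E(S_1^2) + \lambda_2 E(S_2^2)]}{2(1 - \rho_1)^3 (1 - \rho_1 - \rho_2)}. \tag{3.13}$$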
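A numerical cross-check of (3.11) against (3.12) can be sketched as follows for exponential service times, an assumption made only for this illustration. The busy period transform $B_1(s)$ is obtained by iterating the functional equation (4.1) from zero, which is assumed here to converge for real $s > 0$; the mean is then estimated from a one-sided difference quotient of $W_2(s)$ at the origin. All function and variable names are hypothetical.

```python
def w2_transform(s, lam1, lam2, mu1, mu2, iters=500):
    """Evaluate (3.11) for exponential services (rates mu1, mu2) at real s > 0."""
    rho1, rho2 = lam1 / mu1, lam2 / mu2
    # B_1(s): priority-only busy-period transform, by fixed-point iteration on (4.1).
    b1 = 0.0
    for _ in range(iters):
        b1 = mu1 / (mu1 + s + lam1 * (1.0 - b1))
    sigma = s + lam1 * (1.0 - b1)
    big_lam = lam1 + lam2
    # S_2^*(sigma): transform of the class-blind service mixture r1*S1 + r2*S2.
    s_star = (lam1 * mu1 / (mu1 + sigma) + lam2 * mu2 / (mu2 + sigma)) / big_lam
    return (1 - rho1 - rho2) / (1 - big_lam * (1 - s_star) / sigma)


def mean_w2_formula(lam1, lam2, mu1, mu2):
    """E(W_2) from (3.12), using E(S_i^2) = 2 / mu_i**2 for exponential service."""
    rho1, rho2 = lam1 / mu1, lam2 / mu2
    b = 2 * lam1 / mu1**2 + 2 * lam2 / mu2**2
    return b / (2 * (1 - rho1) * (1 - rho1 - rho2))


lam1, lam2, mu1, mu2 = 0.3, 0.2, 1.0, 1.0
h = 1e-4
numeric_mean = (1.0 - w2_transform(h, lam1, lam2, mu1, mu2)) / h   # approximates -W_2'(0)
formula_mean = mean_w2_formula(lam1, lam2, mu1, mu2)
print(numeric_mean, formula_mean)
```

The difference quotient carries a truncation error of order $h\,E(W_2^2)/2$, so agreement is expected only to about three decimal places at this step size.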

Kesten and Runnenburg ([4], [5]) have obtained an alternative characterization
of $W_1$ and $W_2$. The first two moments as computed by their method agree
with (3.4)-(3.5) and (3.12)-(3.13). In addition, Kesten and Runnenburg have
derived a characterization for the transform of the steady state waiting time
distribution of any type $j$ item for a priority queue with general $K$.

The method employed to characterize $W_2(s)$ above can be extended to characterize
the waiting time transform of the lowest priority, type $K$ item for arbitrary
time $t$ and in equilibrium. Let $W_K(t)$ denote the waiting time for a lowest
priority item if it were to arrive at time $t$. $W_K(t)$ is the sum of two components,
$W_K^*(t)$ and $W_K^{**}(t)$, which are defined analogously to $W_2^*$ and $W_2^{**}$. The same
argument verifies that

$$P\{W_K(t) \le x\} = \int_0^x \sum_{n_1, \ldots, n_{K-1}} e^{-\Lambda_{K-1} y}\, \frac{(\lambda_1 y)^{n_1}}{n_1!} \cdots \frac{(\lambda_{K-1} y)^{n_{K-1}}}{n_{K-1}!}\, P\{B_{K-1:n_1,\ldots,n_{K-1}} \le x - y\}\, dP\{W_K^*(t) \le y\}, \tag{3.14}$$

where $B_{K-1:n_1,\ldots,n_{K-1}}$ is the length of a busy period for a priority queue with just
$K - 1$ types $1, \ldots, K - 1$ which commences with $n_i$ type $i$ items, $i = 1, \ldots,
K - 1$, in line initially (see Section 4). In terms of Laplace-Stieltjes transforms
(3.14) becomes

$$W_K(s; t) = W_K^*(s + \Lambda_{K-1}(1 - B_{K-1}^*(s));\, t), \tag{3.15}$$

where $B_{K-1,i}(s)$ is the transform of the busy period distribution for the $K - 1$
dimensional priority queue which commences with a single type $i$ item in line
and $B_{K-1}^*(s) = \sum_{i=1}^{K-1} r_i B_{K-1,i}(s)$, $\Lambda_{K-1} = \lambda_1 + \cdots + \lambda_{K-1}$ and $r_i = \lambda_i / \Lambda_{K-1}$,
$i = 1, \ldots, K - 1$.

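$B_{K-1,i}(s)$ and $B_{K-1}^*(s)$ will be characterized in the next section, and $W_K^*(s; t)$
can be obtained from (3.6) with the identifications

$$\begin{aligned} \lambda &= \Lambda_K = \lambda_1 + \cdots + \lambda_K, \\ S(s) &= S_K^*(s) = r_1 S_1(s) + \cdots + r_K S_K(s), \\ B(s) &= B_K^*(s) = r_1 B_{K1}(s) + \cdots + r_K B_{KK}(s). \end{aligned} \tag{3.16}$$

In the limit as $t \to \infty$

$$W_K(s) = \lim_{t \to \infty} W_K(s; t) = W_K^*(s + \Lambda_{K-1}(1 - B_{K-1}^*(s))), \tag{3.17}$$

where

$$W_K^*(s) = \frac{1 - \sum_{i=1}^{K} \rho_i}{1 - \Lambda_K (1 - S_K^*(s))/s}. \tag{3.18}$$

Moments of the steady state waiting time can be computed from (3.17)-(3.18)
and the results of the next section.

$$E(W_K) = \frac{\sum_{i=1}^{K} \lambda_i E(S_i^2)}{2\left(1 - \sum_{i=1}^{K-1} \rho_i\right)\left(1 - \sum_{i=1}^{K} \rho_i\right)}, \tag{3.19}$$

$$E(W_K^2) = \frac{\sum_{i=1}^{K} \lambda_i E(S_i^3)}{3\left(1 - \sum_{i=1}^{K-1} \rho_i\right)^2 \left(1 - \sum_{i=1}^{K} \rho_i\right)} + \frac{\left[\sum_{i=1}^{K} \lambda_i E(S_i^2)\right]^2}{2\left(1 - \sum_{i=1}^{K-1} \rho_i\right)^2 \left(1 - \sum_{i=1}^{K} \rho_i\right)^2} + \frac{\left[\sum_{i=1}^{K} \lambda_i E(S_i^2)\right]\left[\sum_{i=1}^{K-1} \lambda_i E(S_i^2)\right]}{2\left(1 - \sum_{i=1}^{K-1} \rho_i\right)^3 \left(1 - \sum_{i=1}^{K} \rho_i\right)}. \tag{3.20}$$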
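The moment formulas (3.19) and (3.20) translate directly into a short routine. The sketch below is an illustration under assumed names (`lowest_class_wait_moments`, list-of-tuples input); class $K$ is taken to be the last entry of each list. The two usage lines check the $K = 1$ reduction for exponential service and evaluate a two-class example.

```python
def lowest_class_wait_moments(lams, moments):
    """E(W_K) and E(W_K^2) for the lowest-priority class, per (3.19)-(3.20).

    lams[i] is the arrival rate of class i+1; moments[i] = (E[S], E[S^2], E[S^3]).
    """
    rho_all = sum(l * m[0] for l, m in zip(lams, moments))
    rho_high = rho_all - lams[-1] * moments[-1][0]       # classes 1..K-1 only
    if rho_all >= 1:
        raise ValueError("equilibrium requires rho_1 + ... + rho_K < 1")
    b_all = sum(l * m[1] for l, m in zip(lams, moments))  # sum lambda_i E(S_i^2), i <= K
    b_high = b_all - lams[-1] * moments[-1][1]            # same sum over i <= K-1
    c_all = sum(l * m[2] for l, m in zip(lams, moments))  # sum lambda_i E(S_i^3)
    a, d = 1 - rho_high, 1 - rho_all
    mean = b_all / (2 * a * d)
    second = (c_all / (3 * a**2 * d)
              + b_all**2 / (2 * a**2 * d**2)
              + b_all * b_high / (2 * a**3 * d))
    return mean, second


exp1 = (1.0, 2.0, 6.0)                       # moments of a unit-rate exponential
mean1, second1 = lowest_class_wait_moments([0.5], [exp1])
mean2, second2 = lowest_class_wait_moments([0.3, 0.2], [exp1, exp1])
print(mean1, second1, mean2, second2)
```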


This technique can also be applied to the preemptive “resume” priority queue
to characterize the waiting time distribution for any type item at general time
$t$ and in equilibrium. Let there be $K$ priority classes and $W_j(t)$ be the waiting
time for an item in the $j$th class if it were to arrive at time $t$. The distribution
of $W_1(t)$ is the same as for type one items in isolation since priority items preempt
any lower class items in service. The waiting time $W_j(t)$, $j > 1$, consists
of two components, $W_j^*(t)$ and $W_j^{**}(t)$. $W_j^*(t)$ is the time required to service
all items of priority $\le j$ which are in the queue at time $t$, and its Laplace-Stieltjes
transform is given by (3.6) and (3.7) with $\lambda = \Lambda_j$, $S(s) = S_j^*(s)$, and
$B(s) = B_j^*(s) = r_1 B_{j1}(s) + \cdots + r_j B_{jj}(s)$.

$B_{ji}(s)$ and $B_j^*(s)$ for the preemptive resume discipline will be characterized in
Section 4. The distribution of $W_j^*(t)$ is unaffected by the presence of lower
priority items because of the preemptive discipline, and since an item “resumes”
service after preemption, the priority discipline among the items of types $1, \ldots, j$
could just as well be abandoned as far as the distribution of $W_j^*(t)$ is concerned.
$W_j^{**}(t)$ is the time required to service all arrivals of priority $< j$ which arrive
after $t$ but before the type $j$ item can enter service, and it is given by a convolution
of busy periods $B_{j-1,i}$, $i = 1, \ldots, j - 1$, where the degree of the convolution
is determined by the number of arrivals in the time interval $(0, W_j^*(t))$.
Hence,


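$$P\{W_j(t) \le x\} = \int_0^x \sum_{n_1, \ldots, n_{j-1}} e^{-\Lambda_{j-1} y}\, \frac{(\lambda_1 y)^{n_1}}{n_1!} \cdots \frac{(\lambda_{j-1} y)^{n_{j-1}}}{n_{j-1}!}\, P\{B_{j-1:n_1,\ldots,n_{j-1}} \le x - y\}\, dP\{W_j^*(t) \le y\}, \tag{3.21}$$

and

$$W_j(s; t) = W_j^*(s + \Lambda_{j-1}(1 - B_{j-1}^*(s));\, t). \tag{3.22}$$

For the stationary case

$$W_j(s) = \lim_{t \to \infty} W_j(s; t) = W_j^*(s + \Lambda_{j-1}(1 - B_{j-1}^*(s))), \tag{3.23}$$

where

$$W_j^*(s) = \frac{1 - \sum_{i=1}^{j} \rho_i}{1 - \Lambda_j (1 - S_j^*(s))/s}. \tag{3.24}$$

The first two moments of $W_j$ are given by (3.19) and (3.20) with $K$ replaced
by $j$.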
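The quantity $T_j$, the “time in service” of a type $j$ priority item, is $S_j$ only
for $j = 1$ under the preemptive resume discipline. For $j > 1$

$$P\{T_j \le x\} = \int_0^x \sum_{n_1, \ldots, n_{j-1}} e^{-\Lambda_{j-1} u}\, \frac{(\lambda_1 u)^{n_1}}{n_1!} \cdots \frac{(\lambda_{j-1} u)^{n_{j-1}}}{n_{j-1}!}\, P\{B_{j-1:n_1,\ldots,n_{j-1}} \le x - u\}\, dF_{S_j}(u), \tag{3.25}$$

so

$$T_j(s) = S_j(s + \Lambda_{j-1}(1 - B_{j-1}^*(s))). \tag{3.26}$$

The first two moments of $T_j$ are

$$E(T_j) = \frac{E(S_j)}{1 - \sum_{i=1}^{j-1} \rho_i}, \tag{3.27}$$

$$E(T_j^2) = \frac{E(S_j^2)}{\left[1 - \sum_{i=1}^{j-1} \rho_i\right]^2} + \frac{E(S_j)\,[\lambda_1 E(S_1^2) + \cdots + \lambda_{j-1} E(S_{j-1}^2)]}{\left[1 - \sum_{i=1}^{j-1} \rho_i\right]^3}. \tag{3.28}$$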
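The two moments (3.27) and (3.28) depend only on the classes of higher priority than $j$. A small Python sketch, with a hypothetical function name and input layout, makes this explicit; for $j = 1$ it simply returns the moments of $S_1$.

```python
def time_in_service_moments(j, lams, moments):
    """E(T_j) and E(T_j^2) under preemptive resume, per (3.27)-(3.28).

    j is 1-based; lams[i] is the arrival rate and moments[i] = (E[S], E[S^2])
    for class i+1.
    """
    higher = range(j - 1)                                   # indices of classes 1..j-1
    a = 1 - sum(lams[i] * moments[i][0] for i in higher)    # 1 - rho_1 - ... - rho_{j-1}
    if a <= 0:
        raise ValueError("higher-priority load must be below 1")
    b = sum(lams[i] * moments[i][1] for i in higher)        # sum of lambda_i E(S_i^2)
    es, es2 = moments[j - 1]
    return es / a, es2 / a**2 + es * b / a**3


lams = [0.5, 0.2]
moments = [(1.0, 2.0), (1.0, 2.0)]        # illustrative values (unit-rate exponentials)
t1 = time_in_service_moments(1, lams, moments)
t2 = time_in_service_moments(2, lams, moments)
print(t1, t2)
```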


The preemptive priority queue with indifferent server is a special case of the 
preemptive resume priority queue so the above results apply as well to the 
indifferent server queue. The waiting time questions for the preemptive repeat 
priority queue are for the most part still unsolved. 




4. Busy period distributions. A queue is said to be “busy” or “empty” depend- 
ing upon whether or not there is an item in service. The length of a busy period 
is the length of time between the arrival of an item at the empty queue and the 
first subsequent moment at which the queue is again empty. The technique 
which will be used to characterize the busy period is an adaptation of that 
introduced by Takács [15] to solve the busy period problem for the simple queue
with a single class, Poisson arrivals ($\lambda$), and a general service distribution $F_S$.
Takács established that $B(s)$, the Laplace-Stieltjes transform of the busy period
distribution, satisfies the functional equation

$$f(s) = S(s + \lambda(1 - f(s))) \tag{4.1}$$

and is in fact the unique solution to (4.1) which satisfies in addition

$$\text{(i) } f(s) \text{ analytic for } \mathrm{Re}\{s\} > 0, \qquad \text{(ii) } \lim_{s \to \infty,\ s\ \mathrm{real}} f(s) = 0. \tag{4.2}$$
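For a concrete service distribution, the functional equation (4.1) can be solved numerically at a fixed real $s > 0$. The sketch below iterates the map $f \mapsto S(s + \lambda(1 - f))$ starting from $f = 0$; that this particular iteration converges to the solution singled out by (4.2) is an assumption of the sketch, checked here only against the known closed form for exponential service. Names such as `busy_period_lst` are hypothetical.

```python
import math


def busy_period_lst(s, lam, service_lst, iters=200):
    """Solve f = S(s + lam*(1 - f)) of (4.1) by fixed-point iteration from f = 0."""
    f = 0.0
    for _ in range(iters):
        f = service_lst(s + lam * (1.0 - f))
    return f


lam, mu = 0.5, 1.0


def exp_lst(u):
    """Laplace-Stieltjes transform of an exponential service time with rate mu."""
    return mu / (mu + u)


def mm1_closed_form(s):
    """Closed-form busy-period transform for exponential service (cross-check only)."""
    a = lam + mu + s
    return (a - math.sqrt(a * a - 4 * lam * mu)) / (2 * lam)


value = busy_period_lst(1.0, lam, exp_lst)
print(value, mm1_closed_form(1.0))
```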


Consider a priority queue with $K$ priority classes and head-of-the-line discipline.
$B_{Ki}$ is the length of a busy period which commences with the arrival of
a type $i$ item, $i = 1, \ldots, K$. $F_{B_{Ki}}$ will denote the distribution function of $B_{Ki}$
and $B_{Ki}(s)$ the corresponding Laplace-Stieltjes transform. $B_K^*$ is the average
busy period in which the priority class of the initial arrival is not specified.
$F_{B_K^*} = r_1 F_{B_{K1}} + \cdots + r_K F_{B_{KK}}$, and $B_K^*(s) = r_1 B_{K1}(s) + \cdots + r_K B_{KK}(s)$.

The equilibrium condition $1 - \rho_1 - \cdots - \rho_K > 0$ will be assumed so that
$F_{B_{Ki}}$ is a bona fide distribution. With modification the discussion applies as well
to the transient case.

Arrivals at the queue constitute a Poisson process with parameter $\Lambda_K$, and
given that an arrival has occurred the probability it belongs to priority class $j$
is $r_j$. At the end of the service period of the initial arrival there will be $n_1, \ldots,
n_K$ items of types $1, \ldots, K$, respectively, in the queue. The busy period will
be prolonged by the amount $B_{K:n_1,\ldots,n_K}$, which denotes a busy period commencing
with $n_1, \ldots, n_K$ items of types $1, \ldots, K$, respectively, in line initially. However,
the distribution of $B_{K:n_1,\ldots,n_K}$ is just a convolution of the distributions
$F_{B_{K1}}^{(n_1)}, \ldots, F_{B_{KK}}^{(n_K)}$, where $F_{B_{Ki}}^{(n_i)}$ denotes an $n_i$-fold convolution of $F_{B_{Ki}}$. Hence,

$$F_{B_K^*}(x) = \int_0^x \sum_{n_1, \ldots, n_K} e^{-\Lambda_K u}\, \frac{(\Lambda_K u)^{n_1 + \cdots + n_K}}{n_1! \cdots n_K!}\, (r_1)^{n_1} \cdots (r_K)^{n_K}\, P\{B_{K:n_1,\ldots,n_K} \le x - u\}\, dF_{S_K^*}(u), \tag{4.3}$$

and

$$B_K^*(s) = S_K^*(s + \Lambda_K(1 - B_K^*(s))), \tag{4.4}$$

where $F_{S_K^*} = r_1 F_{S_1} + \cdots + r_K F_{S_K}$ and $S_K^*(s)$ is its transform.
$B_K^*(s)$ is the unique solution to the functional equation

$$f(s) = S_K^*(s + \Lambda_K(1 - f(s))) \tag{4.5}$$


which satisfies the regularity conditions (4.2). The proof of this assertion can 
be obtained from the proof for the simple queue [15] with the appropriate identi- 
fications, but an alternative proof is presented below. This proof represents a 
simpler proof of the result for the single class queue. 

It is sufficient to show that (4.5) and (ii) determine $f(s)$ uniquely for real
$s > 0$ since by (i) this determines $f(s)$ in the whole half-plane. Suppose there
exist two functions $f_1(s)$ and $f_2(s)$ which satisfy (4.5), (i), (ii) but $f_1(s) \ne f_2(s)$
for real $s > 0$. Let $s_0 > 0$ be a point for which $f_1(s_0) > f_2(s_0) > 0$.
Since $\lim_{s \to \infty} f_1(s) = 0$ and $f_1$ is continuous, there must exist an $s_1 > s_0$ such
that $f_1(s_1) = f_2(s_0) = c$. But this implies that there exist two different values,
namely $s_0$ and $s_1$, which satisfy $c = S_K^*(s + \Lambda_K(1 - c))$, which is impossible
since the right-hand side is a strictly decreasing function of $s$.

This proof can be extended to characterize $F_{B_K^*}(+\infty)$ in the transient case.

Moments of $B_K^*$ can be computed from (4.4).


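$$E(B_K^*) = \frac{E(S_K^*)}{1 - \Lambda_K E(S_K^*)} = \frac{r_1 E(S_1) + \cdots + r_K E(S_K)}{1 - \rho_1 - \cdots - \rho_K}, \tag{4.6}$$

$$E(B_K^{*2}) = \frac{E(S_K^{*2})}{(1 - \Lambda_K E(S_K^*))^3} = \frac{r_1 E(S_1^2) + \cdots + r_K E(S_K^2)}{(1 - \rho_1 - \cdots - \rho_K)^3}. \tag{4.7}$$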

The characterization of $B_{Ki}$ will be constructed recursively. Assume that
$F_{B_{K-1,i}}$, $i = 1, \ldots, K - 1$, and the corresponding $B_{K-1,i}(s)$ have been determined.
As far as the distribution of the busy period is concerned, the priority discipline
can be disregarded. If the busy period commences with the arrival of a type $i$
item, the queue discipline could just as well stipulate that after this item has
been serviced no other type $i$ items will be serviced until the queue no longer
contains other type items. Let $H_{Ki}$ be the distribution of the time required to
empty the queue of items other than type $i$ (including the service time of the
initial $i$ item). If in the time required to clear the queue of non-type $i$ items
$n$ type $i$ items arrive, the busy period is prolonged by an $n$-fold convolution of
busy periods $B_{Ki}$. Hence,

$$F_{B_{Ki}}(x) = \int_0^x \sum_{n=0}^{\infty} e^{-\lambda_i y}\, \frac{(\lambda_i y)^n}{n!}\, F_{B_{Ki}}^{(n)}(x - y)\, dH_{Ki}(y), \tag{4.8}$$

or

$$B_{Ki}(s) = H_{Ki}(s + \lambda_i(1 - B_{Ki}(s))). \tag{4.9}$$

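But $H_{Ki}$ is given by

$$H_{Ki}(y) = \int_0^y \sum_{n_1, \ldots, n_{i-1}, n_{i+1}, \ldots, n_K} e^{-(\Lambda_K - \lambda_i) z}\, \frac{(\lambda_1 z)^{n_1}}{n_1!} \cdots \frac{(\lambda_{i-1} z)^{n_{i-1}}}{n_{i-1}!}\, \frac{(\lambda_{i+1} z)^{n_{i+1}}}{n_{i+1}!} \cdots \frac{(\lambda_K z)^{n_K}}{n_K!}\, \left[ {}_iF_{B_{K-1,1}}^{(n_1)} * \cdots * {}_iF_{B_{K-1,K}}^{(n_K)} \right](y - z)\, dF_{S_i}(z), \tag{4.10}$$

where ${}_iB_{K-1,j}$ denotes a busy period for a queue with $K - 1$ priority classes,
the $i$th class of the original $K$ classes being absent, and a type $j$ item in line
initially. (4.10) implies

$$H_{Ki}(s) = S_i(s + {}_i\Lambda_{K-1}(1 - {}_iB_{K-1}^*(s))), \tag{4.11}$$

where ${}_i\Lambda_{K-1} = \Lambda_K - \lambda_i$ and ${}_iB_{K-1}^*(s) = \sum_{j \ne i} (\lambda_j / {}_i\Lambda_{K-1})\, {}_iB_{K-1,j}(s)$. (4.9) and (4.11) together
yield

$$B_{Ki}(s) = S_i\Big(s + \lambda_i(1 - B_{Ki}(s)) + {}_i\Lambda_{K-1}\big(1 - {}_iB_{K-1}^*(s + \lambda_i(1 - B_{Ki}(s)))\big)\Big). \tag{4.12}$$

$B_{Ki}(s)$ is in fact the unique solution to the functional equation (4.12) subject
to the regularity conditions (4.2). The proof of uniqueness is analogous to the
previous proof for $B_K^*(s)$.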

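The moments of $B_{Ki}$ are derivable from (4.12). In particular,

$$E(B_{Ki}) = \frac{E(S_i)}{1 - \sum_{j=1}^{K} \rho_j}, \tag{4.13}$$

$$E(B_{Ki}^2) = \frac{E(S_i^2)\left[1 - \sum_{j=1}^{K} \rho_j\right] + E(S_i)\left[\sum_{j=1}^{K} \lambda_j E(S_j^2)\right]}{\left[1 - \sum_{j=1}^{K} \rho_j\right]^3}. \tag{4.14}$$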
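Since $B_K^*$ is the mixture of the $B_{Ki}$ with weights $r_i$, the moments (4.13)-(4.14) averaged over $i$ should reproduce (4.6)-(4.7). The sketch below, with hypothetical names and illustrative input values, performs that consistency check numerically.

```python
def busy_period_moments(i, lams, moments):
    """E(B_Ki) and E(B_Ki^2) from (4.13)-(4.14); i is 1-based, moments[k] = (E[S], E[S^2])."""
    d = 1 - sum(l * m[0] for l, m in zip(lams, moments))    # 1 - rho_1 - ... - rho_K
    if d <= 0:
        raise ValueError("equilibrium requires rho_1 + ... + rho_K < 1")
    b = sum(l * m[1] for l, m in zip(lams, moments))        # sum of lambda_j E(S_j^2)
    es, es2 = moments[i - 1]
    return es / d, (es2 * d + es * b) / d**3


lams = [0.2, 0.3]
moments = [(1.0, 2.0), (0.5, 0.5)]        # (E[S], E[S^2]) per class; illustrative values
total = sum(lams)
# Mix over the class of the initial arrival with weights r_i = lambda_i / Lambda_K.
mix_mean = sum(l / total * busy_period_moments(k + 1, lams, moments)[0]
               for k, l in enumerate(lams))
mix_second = sum(l / total * busy_period_moments(k + 1, lams, moments)[1]
                 for k, l in enumerate(lams))
print(mix_mean, mix_second)
```

For these inputs $E(S_K^*) = 0.7$, $E(S_K^{*2}) = 1.1$ and $1 - \rho_1 - \rho_2 = 0.65$, so (4.6) and (4.7) give $0.7/0.65$ and $1.1/0.65^3$.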

The distribution of the busy period for a preemptive resume priority queue is
identical to the busy period distribution for the same queue with head-of-the-line
discipline. The order of service is immaterial to the busy period as long as preemption
does not increase the time spent in the service mechanism, which is the
case for a resume discipline. Hence, all the previous results for head-of-the-line
discipline apply as well to the preemptive resume queue. The indifferent server
queue is included as a special case of the resume discipline. A similar identification
cannot be made for a preemptive repeat priority queue since the length of
time a lower priority item spends in the service mechanism is no longer equal to
the service time in isolation. To date no characterization has been obtained for
the busy period of a preemptive repeat priority queue.


5. Distribution of number of items serviced during a busy period. Takács in
[15] characterized the distribution of the number of items serviced during a
busy period for the simple queue, and this method can be adapted in a fashion
analogous to Section 4. As before, the equilibrium condition $1 - \rho_1 - \cdots - \rho_K > 0$
will be assumed so that all the distributions which will be discussed have
total variation one. The reader can easily modify the discussion to cover the
transient case.

Consider a priority queue with $K$ priority classes and head-of-the-line discipline.
Let $f_{Ki}^*(j)$ be the probability that a total of $j$ items, irrespective of class,
are serviced during a busy period commencing with a single type $i$ item in the
queue, and let $f_K^*(j) = r_1 f_{K1}^*(j) + \cdots + r_K f_{KK}^*(j)$ be the probability of servicing
a total of $j$ items where the class of the initial item is unspecified. $f_{Ki}^*(s)$ and
$f_K^*(s)$ will denote the generating functions of $\{f_{Ki}^*(j)\}$ and $\{f_K^*(j)\}$, respectively.
For a specific class $i$ let $f_{Ki}(j)$ be the probability of servicing $j$ type $i$ items in
a service period which commences with a type $i$ item in line initially. $f_{Ki}(s)$ will
denote the generating function of $\{f_{Ki}(j)\}$.

The determination of $f_K^*(s)$ will be treated first. Let $p_{K:n_1,\ldots,n_K}$ be the probability
that during the service period for the initial unspecified item $n_j$ type $j$
items, $j = 1, \ldots, K$, arrive. Since the initial item is type $j$ with probability $r_j$,

$$p_{K:n_1,\ldots,n_K} = \int_0^{\infty} e^{-\Lambda_K t}\, \frac{(\Lambda_K t)^{n_1 + \cdots + n_K}}{n_1! \cdots n_K!}\, (r_1)^{n_1} \cdots (r_K)^{n_K}\, dF_{S_K^*}(t), \tag{5.1}$$

and

$$P_K(s_1, \ldots, s_K) = \sum_{n_1, \ldots, n_K} p_{K:n_1,\ldots,n_K}\, s_1^{n_1} \cdots s_K^{n_K} = S_K^*\Big(\Lambda_K\Big(1 - \sum_{i=1}^{K} r_i s_i\Big)\Big). \tag{5.2}$$

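By an argument analogous to that employed in Section 4, the $f_K^*(j)$ and $f_{Ki}^*(j)$
can be shown to satisfy the relations

$$\begin{aligned} f_K^*(1) &= p_{K:0,\ldots,0}, \\ f_K^*(j) &= \sum_{0 < \sum n_i \le j-1} p_{K:n_1,\ldots,n_K} \sum_{j_{11} + \cdots + j_{Kn_K} = j-1} f_{K1}^*(j_{11}) \cdots f_{K1}^*(j_{1n_1}) \cdots f_{KK}^*(j_{K1}) \cdots f_{KK}^*(j_{Kn_K}), \qquad j \ge 2. \end{aligned} \tag{5.3}$$

This yields for the generating function

$$f_K^*(s) = s\, S_K^*(\Lambda_K(1 - f_K^*(s))). \tag{5.4}$$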





$f_K^*(s)$ is the unique solution to the functional equation

$$f(s) = s\, S_K^*(\Lambda_K(1 - f(s))), \qquad |s| \le 1, \tag{5.5}$$

subject to the regularity conditions

$$\text{(i) } f(s) \text{ analytic for } |s| \le 1, \qquad \text{(ii) } f(0) = 0. \tag{5.6}$$

The proof of uniqueness is omitted since it follows either from the simple queue
proof [15] or from an argument similar to that for the busy period in the previous
section.

The moments of $N_K$, the total number of items serviced during a busy period,
are obtainable from (5.4). In particular,

$$\begin{aligned} E(N_K) &= \frac{1}{1 - \Lambda_K E(S_K^*)} = \frac{1}{1 - \rho_1 - \cdots - \rho_K}, \\ E(N_K^2) &= \frac{\Lambda_K^2 E(S_K^{*2})}{(1 - \Lambda_K E(S_K^*))^3} + \frac{2 \Lambda_K E(S_K^*)}{(1 - \Lambda_K E(S_K^*))^2} + \frac{1}{1 - \Lambda_K E(S_K^*)}. \end{aligned} \tag{5.7}$$
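The moments in (5.7) can be compared with a numerical solution of the functional equation (5.5). In the sketch below the generating function is computed by iterating (5.5) from $f = 0$ at real $s$ in $(0, 1]$, and its derivative at $s = 1$ is estimated by a one-sided difference quotient. The convergence of this iteration, the single class exponential example and all names are assumptions of the illustration.

```python
def items_served_moments(lams, moments):
    """E(N_K) and E(N_K^2) from (5.7); moments[i] = (E[S_i], E[S_i^2])."""
    big_lam = sum(lams)
    rho = sum(l * m[0] for l, m in zip(lams, moments))        # Lambda_K E(S_K^*)
    lam_es2 = sum(l * m[1] for l, m in zip(lams, moments))    # Lambda_K E(S_K^{*2})
    d = 1 - rho
    if d <= 0:
        raise ValueError("equilibrium requires rho_1 + ... + rho_K < 1")
    mean = 1 / d
    second = big_lam * lam_es2 / d**3 + 2 * rho / d**2 + 1 / d
    return mean, second


def gen_fn(s, big_lam, service_lst, iters=500):
    """Iterate f = s * S(Lambda * (1 - f)), the functional equation (5.5), from f = 0."""
    f = 0.0
    for _ in range(iters):
        f = s * service_lst(big_lam * (1.0 - f))
    return f


lam, mu = 0.5, 1.0
mean, second = items_served_moments([lam], [(1 / mu, 2 / mu**2)])
h = 1e-5
exp_lst = lambda u: mu / (mu + u)           # transform of an exponential service time
numeric_mean = (gen_fn(1.0, lam, exp_lst) - gen_fn(1.0 - h, lam, exp_lst)) / h
print(mean, second, numeric_mean)
```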

The property that the number of type $i$ items serviced during a busy period
is independent of the priority discipline makes it feasible to obtain a functional
relation for $f_{Ki}(s)$. Let $p_{Ki:n}$ be the probability that $n$ type $i$ items arrive during
the time it takes to service the initial type $i$ item and then clear the queue of
other type items without admitting any type $i$ items into service.

$$p_{Ki:n} = \int_0^{\infty} e^{-\lambda_i t}\, \frac{(\lambda_i t)^n}{n!}\, dH_{Ki}(t), \tag{5.8}$$

and

$$P_{Ki}(s) = \sum_{n=0}^{\infty} p_{Ki:n}\, s^n = H_{Ki}(\lambda_i(1 - s)), \tag{5.9}$$

where $H_{Ki}(t)$ and $H_{Ki}(s)$ were defined in (4.10) and (4.11). Under the discipline
of servicing the initial type $i$ item and then clearing the queue of the
other class items the $f_{Ki}(j)$ must satisfy

$$\begin{aligned} f_{Ki}(1) &= p_{Ki:0}, \\ f_{Ki}(j) &= \sum_{n=1}^{j-1} p_{Ki:n} \sum_{j_1 + \cdots + j_n = j-1} f_{Ki}(j_1) \cdots f_{Ki}(j_n), \qquad j \ge 2, \end{aligned} \tag{5.10}$$

so

$$f_{Ki}(s) = s\, H_{Ki}(\lambda_i(1 - f_{Ki}(s))). \tag{5.11}$$

The proof that $f_{Ki}(s)$ is the unique solution to (5.11) subject to the regularity
conditions (5.6) is omitted.

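The first two moments of $N_{Ki}$, the number of type $i$ items serviced during a
busy period commencing with a type $i$ item, are determinable from (5.11).

$$\begin{aligned} E(N_{Ki}) &= \frac{1 - \sum_{j \ne i} \rho_j}{1 - \sum_{j=1}^{K} \rho_j}, \\ E(N_{Ki}^2) &= \frac{\lambda_i^2 E(S_i^2)\left[1 - \sum_{j \ne i} \rho_j\right] + \lambda_i^2 E(S_i)\left[\sum_{j \ne i} \lambda_j E(S_j^2)\right]}{\left[1 - \sum_{j=1}^{K} \rho_j\right]^3} + \frac{2 \rho_i \left[1 - \sum_{j \ne i} \rho_j\right]}{\left[1 - \sum_{j=1}^{K} \rho_j\right]^2} + \frac{1 - \sum_{j \ne i} \rho_j}{1 - \sum_{j=1}^{K} \rho_j}. \end{aligned} \tag{5.12}$$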
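A direct transcription of (5.12) is sketched below; the function name and input layout are assumptions. For $K = 1$ there are no other classes, and the result should coincide with the single class values of (5.7), which the usage line checks.

```python
def type_i_served_moments(i, lams, moments):
    """E(N_Ki) and E(N_Ki^2) from (5.12); i is 1-based, moments[k] = (E[S], E[S^2])."""
    rhos = [l * m[0] for l, m in zip(lams, moments)]
    rho_i = rhos[i - 1]
    a = 1 - (sum(rhos) - rho_i)          # 1 - sum over j != i of rho_j
    d = 1 - sum(rhos)                    # 1 - rho_1 - ... - rho_K
    if d <= 0:
        raise ValueError("equilibrium requires rho_1 + ... + rho_K < 1")
    # Sum of lambda_j E(S_j^2) over the classes other than i.
    others = sum(l * m[1]
                 for k, (l, m) in enumerate(zip(lams, moments)) if k != i - 1)
    lam_i = lams[i - 1]
    es, es2 = moments[i - 1]
    mean = a / d
    second = (lam_i**2 * (es2 * a + es * others) / d**3
              + 2 * rho_i * a / d**2
              + a / d)
    return mean, second


mean, second = type_i_served_moments(1, [0.5], [(1.0, 2.0)])
print(mean, second)
```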


The distributions of the total number of items and the number of type $i$ items
serviced when the initial arrival is of type $j$ can be determined by forming the
appropriate convolutions of service periods and busy periods with the distributions
already determined in this section.

As in the case of the busy period distributions the above results for head-of-the- 
line discipline apply equally as well to the preemptive resume priority queue. 
This also includes as a special case the priority queue with indifferent server. The 
corresponding distributions for the preemptive repeat priority queue still remain 
to be determined. 


6. Acknowledgments. I would like to thank Professor Samuel Karlin for sug- 
gesting this research topic and for generously contributing his help and guid- 
ance. My thanks also to Mr. J. H. Kullback for checking part of the algebraic 
acrobatics. 


REFERENCES 


[1] Alan Cobham, “Priority assignment in waiting line problems,” J. Opns. Res. Soc.
Am., Vol. 2 (1954), pp. 70-76.

[2] Alan Cobham, “Priority assignment—a correction,” J. Opns. Res. Soc. Am., Vol. 3
(1955), p. 547.

[3] Julian L. Holley, “Waiting line subject to priorities,” J. Opns. Res. Soc. Am., Vol.
2 (1954), pp. 341-343.

[4] H. Kesten and J. Th. Runnenburg, “Priority in waiting line problems,” Proc. Akad.
Wet. Amst. A, Vol. 60 (1957), pp. 312-336.

[5] Thomas L. Saaty, “Résumé of useful formulas in queueing theory,” J. Opns. Res.
Soc. Am., Vol. 5 (1957), pp. 161-200.

[6] Philip M. Morse, Queues, Inventories, and Maintenance, John Wiley and Sons, New
York, 1958.

[7] Harrison White and Lee S. Christie, “Queueing with preemptive priorities or with
breakdown,” J. Opns. Res. Soc. Am., Vol. 6 (1958), pp. 79-95.

[8] Frederick F. Stephan, “Two queues under preemptive priority with Poisson arrival
and service rates,” J. Opns. Res. Soc. Am., Vol. 6 (1958), pp. 399-418.

[9] Ernest Koenigsberg, “Queueing with special service,” J. Opns. Res. Soc. Am., Vol.
(1956), pp. 213-220.

[10] David G. Kendall, “Some problems in the theory of queues,” J. Roy. Stat. Soc. B.,
Vol. 13 (1951), pp. 151-173.

[11] David G. Kendall, “Stochastic processes occurring in the theory of queues and their
analysis by the method of the imbedded Markov chain,” Ann. Math. Stat., Vol.
24 (1953), pp. 338-354.

[12] David M. G. Wishart, “A queueing system with $\chi^2$ service-time distribution,” Ann.
Math. Stat., Vol. 27 (1956), pp. 768-779.

[13] Rupert G. Miller, Jr., “A contribution to the theory of bulk queues,” J. Roy. Stat.
Soc. B., to be published.

[14] D. V. Lindley, “The theory of queues with a single server,” Proc. Cambridge Philos.
Soc., Vol. 48 (1952), pp. 277-289.

[15] Lajos Takács, “Investigation of waiting time problems by reduction to Markov
processes,” Acta Math. Acad. Sci. Hung., Vol. 6 (1955), pp. 101-128.







ON TIME DEPENDENT QUEUING PROCESSES 


By J. Keilson and A. Kooharian
Sylvania Applied Research Laboratory, Waltham, Massachusetts 


1. Introduction. It is well known that the general class of stochastic processes 
with discrete states in continuous time arising in queuing theory, birth-death 
processes, etc., can be characterized as Markov processes provided the full set 
of random variables needed to specify the state of the process is employed. A 
detailed illustration of this approach is given by Cox in [1]¹ for the case of a
queue in equilibrium subject to a random (Poisson) arrival distribution and a 
general service time distribution. Our object in this paper is to initiate a syste- 
matic development of this approach in the theory of queues. It turns out that 
such a development for the time dependent version of the above described queu- 
ing problem requires analytical considerations not encountered in the equi- 
librium case. Similarly the systematic development of this approach for the 
queue with a general arrival distribution (as well as a general service time dis- 
tribution) leads to a still different type of mathematical problem (simultaneous 
Wiener-Hopf integral equations with an analytic side condition) which we in- 
tend to report on elsewhere. 

One final remark is in order concerning the formulation of the approach and 
derivation of the governing differential equations carried out in sections 2 
and 3. While there is a general similarity between the arguments in these sec- 
tions and those, for example, in [1], we prefer to give a self contained discussion 
in order to exhibit how the additional complications arising from the considera- 
tion of time dependence can be incorporated in the general approach. 


2. Phase space. We assume a Poisson arrival distribution with a mean rate
of arrival $\lambda$ and that the service time $x$ between an admission and completion
is specified by an arbitrary probability density $D(x)$.

The state of the entire system (queue and service operation) at time $t$ is
specified by the number, $m$, of people in queue and the elapsed time, $x$, of the
person currently in service. Our phase space $\Gamma$, accordingly, will be two dimensional
with one discrete dimension consisting of the non-negative integers (queue
lengths) and one continuous dimension consisting of the positive reals $x$. The
state of the system is then characterized by a point in $\Gamma$. For completeness
there should be a single additional point in $\Gamma$ corresponding to the state of total
vacancy of the system.

We can now introduce the probability density $W_m(x, t)$ on $\Gamma$ for the probability
that at time $t$ the queue length, excluding servee, is $m$ and the elapsed
time in service is $x$. It is worth emphasizing that the characterization of the
state of the system by means of the set of probability densities $W_m(x, t)$ is
Received July 6, 1959; revised August 27, 1959. 

¹ We are indebted to the referee for bringing Cox’s work to our attention.






QUEUING PROCESSES 105 


well defined independently of the queuing discipline. Thus whereas the queue
length $m$ is essential in our characterization of the state of the system, nothing
whatsoever is implied with respect to the discipline governing selection from
the queue for servicing (e.g. first come first served, completely random, etc.).
It can be shown, nevertheless, that queue-discipline dependent aspects of the
system such as waiting time distributions are deducible from the $W_m(x, t)$
when the queuing discipline is of the first come first served or random selection
type (see [1], for example).


3. Analysis. The derivation of the difference-differential equations for the
$W_m(x, t)$ employs elementary continuity arguments concerning the motion of
the system in $\Gamma$. Consideration of the continuity of the flow during a time interval
$(t, t + \Delta)$ leads to the equations

$$W_m(x + \Delta, t + \Delta) = W_m(x, t)(1 - \lambda\Delta)(1 - \eta(x)\Delta) + W_{m-1}(x, t)\lambda\Delta, \qquad m = 1, 2, \ldots, \tag{3.1}$$

to first order terms in $\Delta$. Equation (3.1) is the basic relationship connecting the
state of the system at a time $t + \Delta$ to those at time $t$ from which the present
state is attainable in phase space by the occurrence or nonoccurrence of arrivals
and departures in the interval $\Delta$. The interpretation of $\eta(x)$ in (3.1), accordingly,
is similar to that of $\lambda$; i.e., $\eta(x)\Delta$ is the first order probability that a service
completion occurs in the interval $(x, x + \Delta)$ conditioned on the system having
reached the state $x$. The relationship of $\eta(x)$ to $D(x)$ is given by

$$D(x) = \eta(x) \exp\left\{ -\int_0^x \eta(y)\, dy \right\}. \tag{3.2}$$
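Relation (3.2) can be checked numerically for any service distribution whose completion rate $\eta(x)$ is known in closed form. The sketch below uses an Erlang service time of order two as an assumed example, integrates $\eta$ with the trapezoid rule, and compares the result with the known density; the function names are hypothetical.

```python
import math


def density_from_hazard(eta, x, steps=10_000):
    """Evaluate D(x) = eta(x) * exp(-integral_0^x eta(y) dy) with the trapezoid rule."""
    h = x / steps
    integral = 0.5 * (eta(0.0) + eta(x)) * h
    integral += h * sum(eta(k * h) for k in range(1, steps))
    return eta(x) * math.exp(-integral)


mu = 2.0


def erlang2_hazard(x):
    """Completion rate of an Erlang-2 service time with stage rate mu."""
    return mu * mu * x / (1.0 + mu * x)


def erlang2_density(x):
    """Density of the same Erlang-2 service time."""
    return mu * mu * x * math.exp(-mu * x)


approx = density_from_hazard(erlang2_hazard, 1.5)
exact = erlang2_density(1.5)
print(approx, exact)
```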


Rearranging terms in (3.1), dividing by $\Delta$ and taking the limit as $\Delta \to 0$ in
(3.1), we obtain, for $m = 1, 2, \ldots$,

$$\frac{\partial W_m}{\partial t} + \frac{\partial W_m}{\partial x} + [\lambda + \eta(x)]\, W_m = \lambda W_{m-1} \tag{3.3}$$

as the governing partial differential equations in the interior of $\Gamma$. For $m = 0$
we similarly find

$$\frac{\partial W_0}{\partial t} + \frac{\partial W_0}{\partial x} + [\lambda + \eta(x)]\, W_0 = 0. \tag{3.4}$$

One additional probability, $E(t)$, that describing the completely vacant state
of the system, must be considered. Again a continuity argument similar to that
leading to (3.2) yields

$$\frac{dE}{dt} + \lambda E(t) = \int_0^{\infty} W_0(x, t)\, \eta(x)\, dx. \tag{3.5}$$


In order to complete our mathematical description, it is necessary to specify
(a) some initial state $\{W_m(x, 0);\ m = 0, 1, 2, \ldots\}$ and $E(0)$ from which
the system starts at $t = 0$, and





106 J. KEILSON AND A. KOOHARIAN 


(b) the boundary conditions on the boundaries of $\Gamma$. ((3.3) and (3.4) are
partial differential equations).

As will become clear in the subsequent analysis, the solution of the set of
equations (3.3), (3.4) together with appropriate boundary conditions will have
a linear dependence on the initial conditions. It is no restriction, therefore, to
consider the problem for initial states of the form

$$W_m(x, 0) = \delta_{mN}\, \delta(x - x_0), \qquad m = 0, 1, 2, \ldots, \tag{3.6}$$

where $\delta_{mN}$ is the ordinary Kronecker’s delta and $\delta(x - x_0)$ the “delta function.”
In words (3.6) corresponds to starting the system with a specific queue length
$N$ and elapsed service time $x_0$.

The derivation of the boundary conditions, on the other hand, requires a
consideration of the motion of the system in $\Gamma$ when a completion occurs. If
the system is in the state $(x, m + 1)$ and experiences a completion, it drops
directly into the state $(0, m)$. Let us consider, therefore, the quantity

$$P_m(t) = \int_0^{\Delta} W_m(x, t)\, dx, \tag{3.7}$$

which is the probability that the system be located at time $t$ in the set of states
$S_m$: $(0, m)$ to $(\Delta, m)$. If we restrict ourselves to single transitions in the time
interval $(t, t + \Delta)$, the following considerations determine $P_m(t + \Delta)$:

(a) A system in the state $(x, m + 1)$ at time $t$ may experience a completion
so that at $t + \Delta$ it lies in $S_m$,

(b) A system at $(x, m)$ for $x > 0$ at time $t$, on the other hand, cannot lie
in $S_m$ at $t + \Delta$,

(c) Similarly systems at $(x, m - 1)$ at time $t$ cannot lie in $S_m$ at $t + \Delta$,
since $x$ is unaffected by arrivals.

Hence continuity requires that to the first order in $\Delta$

$$P_m(t + \Delta) = \Delta \int_0^{\infty} W_{m+1}(x, t)\, \eta(x)\, dx, \qquad m = 1, 2, \ldots.$$

Expanding $P_m(t + \Delta)$ and keeping only first order terms in $\Delta$, we obtain the
boundary condition

$$W_m(0, t) = \int_0^{\infty} W_{m+1}(x, t)\, \eta(x)\, dx, \qquad m = 1, 2, \ldots. \tag{3.8a}$$

The previous argument must be modified for $m = 0$, since a system lying in
the empty state which experiences an arrival in $(t, t + \Delta)$ also finds itself in
$S_0$ at time $t + \Delta$. In this case we obtain

$$W_0(0, t) = \int_0^{\infty} W_1(x, t)\, \eta(x)\, dx + \lambda E(t). \tag{3.8b}$$
0 
The set of equations (3.3-3.6) and conditions (3.8) then provide a complete
description of the queuing problem posed above. It is to be observed that if the
equations of motion (3.3-3.6) are integrated over all $x$ and summed over $m$
taking account of the boundary conditions (3.8), it follows that

$$\frac{d}{dt}\left( E(t) + \sum_{m=0}^{\infty} \int_0^{\infty} W_m(x, t)\, dx \right) = 0,$$

which expresses the conservation of probability.


In order to facilitate the analysis of this system we introduce the generating
function

$$G(s, x, t) = \sum_{m=0}^{\infty} s^m W_m(x, t). \tag{3.9}$$

In terms of $G(s, x, t)$, (3.3) and (3.4) condense into

$$\frac{\partial G}{\partial t} + \frac{\partial G}{\partial x} + [\lambda + \eta(x)]\, G = \lambda s G, \tag{3.10}$$

(3.5) becomes

$$\frac{dE}{dt} + \lambda E(t) = \int_0^{\infty} \eta(x)\, G(0, x, t)\, dx, \tag{3.11}$$

while the boundary conditions (3.8) combine into

$$s\, G(s, 0, t) = \int_0^{\infty} \eta(x)\, G(s, x, t)\, dx + \lambda s E(t) - \int_0^{\infty} \eta(x)\, G(0, x, t)\, dx. \tag{3.12}$$

Our first step in the analysis of the system (3.10)-(3.12) is to make the substitution

$$G(s, x, t) = H(s, x, t)\, e^{-N(x)} \tag{3.13}$$

in (3.10), where we define $N(x) = \int_0^x \eta(y)\, dy$. (3.10) then reduces to

$$\frac{\partial H}{\partial t} + \frac{\partial H}{\partial x} + \lambda(1 - s) H = 0, \tag{3.14}$$

which has the general solution

$$H(s, x, t) = H_0(s, t - x)\, e^{-\lambda(1 - s)x}. \tag{3.15}$$

The addition of (3.11) and (3.12), and the use of (3.13) and (3.15) leads to

$$\frac{dE}{dt} + \lambda E(t) + s H_0(s, t) = \int_0^{\infty} D(x)\, H_0(s, t - x)\, e^{-\lambda(1 - s)x}\, dx + \lambda s E(t), \tag{3.16}$$

where (3.2) has been used in the integrand. The problem has thus been reduced
to the determination of $H_0(s, t)$ and $E(t)$ for $t > 0$ with only a single
integro-differential equation, (3.16), available. Actually there is a second distinguishing
fact about $H_0$ deriving from its relationship to $G$, (3.13), which is
required to possess the analytical structure of a generating function. This latter
fact leads to the analyticity condition discussed in Section 4.

The unknowns in (3.16), $H_0(s, t)$ and $E(t)$, differ significantly with respect to their dependence on $t$; namely, $E(t)$ has no meaning for $t < 0$ whereas $H_0(s, t)$ does. In fact from (3.13) and (3.15) we have

(3.17)  $G(s, x, 0) = H_0(s, -x)\,e^{-\lambda(1 - s)x - N(x)}$


for $x \ge 0$. Thus for negative values of $t$, $H_0(s, t)$ is known explicitly in terms of that part of the initial conditions corresponding to

$G(s, x, 0) = \sum_{m=0}^{\infty} s^m W_m(x, 0).$

Using the specific choice of $\{W_m(x, 0)\}$ in (3.6), we obtain for $x > 0$
(3.18) Hyo(s, -z) = We" 3" (x — 2). 


In so far as the analysis of (3.16) is concerned, the decomposition of Ho(s, t) 
into its known and unknown parts splits the integral into the sum of a known 
inhomogeneous term and a convolution integral involving Ho(s, t) for t > 0 
only. 

For simplicity we shall continue the analysis for the special case of the system starting from the completely unoccupied state, i.e., $G(s, x, 0) = 0$ and $E(0) = 1$. In this case the inhomogeneous term vanishes so that (3.16) becomes

(3.19)  $\frac{dE}{dt} + \lambda(1 - s)E(t) + s H_0(s, t) = \int_0^t D(x)\,H_0(s, t - x)\,e^{-\lambda(1 - s)x}\,dx.$


If we take the Laplace transform of (3.19), and adopt the notation of using lower case letters for the Laplace transforms of capital lettered functions, we find

(3.20)  $[p + \lambda\{1 - s\}]\,e(p) = [d(p + \lambda\{1 - s\}) - s]\,h_0(s, p) + 1,$

or equivalently,

(3.21)  $h_0(s, p) = \frac{[p + \lambda\{1 - s\}]\,e(p) - 1}{d(p + \lambda\{1 - s\}) - s}.$


4. The analyticity condition. Since we may properly restrict ourselves to that class of possible solutions $G(s, x, t)$ which are $L_1(0, \infty)$ in $x$ for $0 \le s \le 1$ and $t \ge 0$, it follows that $h_0(s, p)$ must be analytic in the right half plane $\mathrm{Re}(p) > 0$. If we consider the possible singularities in $h_0$ arising from the roots of the denominator in (3.21), i.e. the set of points $p_s$ satisfying

(4.1)  $s = d(p_s + \lambda\{1 - s\}),$

it is possible to show that there is a continuum of roots, the positive real $p$ axis, in the right half plane. In view of the preceding remark, therefore, $e(p)$ must be chosen so that the numerator of (3.21) cancels these roots of the denominator. This argument specifies the values of $e(p)$ on the positive real axis which, together with the fact that $e(p)$ must be analytic in $\mathrm{Re}(p) > 0$, serves to uniquely determine $e$ by analytic continuation in $\mathrm{Re}(p) > 0$.





In order to give an explicit representation of $e(p)$, it is necessary to determine under what conditions the function $p_s$, defining the locus of roots of (4.1) as $s$ runs from 0 to 1, has an inverse $s_p$. That is, we seek the conditions under which there exists a solution $s_p$ of the functional equation

(4.2)  $s_p = d(p + \lambda\{1 - s_p\}).$


This equation, interestingly enough, has previously arisen in the study of a rather distinct problem in the equilibrium theory of queues; namely, the study of the distribution of occupation times of the server ([2], [3]). The function $s_p$, when it exists, is actually the Laplace transform of this distribution. We give an analysis of the significance of this identification with respect to the time dependent theory in the appendix. In [2], Theorem 6, it is shown that there is a unique analytic solution of (4.2) under the condition $\lambda/\eta < 1$, where

$\eta = \left[ \int_0^\infty x\,D(x)\,dx \right]^{-1}$

is the mean rate of service. This is the familiar stability condition in queuing theory. Under this condition $e(p)$ can be explicitly given as

(4.3)  $e(p) = \frac{1}{p + \lambda\{1 - s_p\}}.$

In general $s_p$ and, therefore, $e(p)$ will require branch cuts in the $p$ plane in order to be well defined for $\mathrm{Re}(p) < 0$. We shall illustrate the nature of the situation by considering a specific case in Section 6.

An alternative integral expression for $E(t)$ may be obtained by utilizing the transformation of variables suggested by (4.2) itself. Indeed, using the transformation

(4.4)  $u = p + \lambda(1 - s_p)$

in the usual inversion formula for $e(p)$ yields

(4.5)  $E(t) = \frac{1}{2\pi i} \int e^{[\lambda d(u) + u - \lambda]t}\,\frac{\lambda d'(u) + 1}{u}\,du,$

where it is easily shown that the contour in the $u$ plane may be taken to be the imaginary $u$-axis indented to the right of the origin.

5. Steady state limit. The state densities for the queue under discussion in 
the equilibrium case are well known [1], [2]. Our object here is to show how 
easily these results follow from the above expressions for the time dependent 
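solution. Indeed, by standard Tauberian arguments

(5.1)  $\lim_{t \to \infty} E(t) = \lim_{p \to 0+} p\,e(p) = 1 - \frac{\lambda}{\eta}$

and

(5.2)  $\lim_{t \to \infty} H_0(s, t) = \lim_{p \to 0+} p\,h_0(s, p) = \frac{\lambda(1 - s)\left(1 - \frac{\lambda}{\eta}\right)}{d(\lambda\{1 - s\}) - s},$

where

(5.3)  $\left( \frac{ds_p}{dp} \right)_{p=0} = \frac{1}{\lambda - \eta}.$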
6. The Poisson/Poisson time dependent queue. When the service time distribution is also exponential with mean service time $\eta^{-1}$ then

(6.1)  $D(x) = \eta\,e^{-\eta x},$

so that

(6.2)  $d(p) = \frac{\eta}{\eta + p}.$

The functional equation (4.2), accordingly, becomes

(6.3)  $s_p = \frac{\eta}{\eta + p + \lambda(1 - s_p)}.$


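Solving for $s_p$ yields

(6.4)  $s_p = \frac{(p + \lambda + \eta) \pm [(p + \lambda + \eta)^2 - 4\lambda\eta]^{1/2}}{2\lambda}.$

Rewriting the expression within the square brackets in the form

(6.5)  $\{(p + \lambda + \eta) - 2(\lambda\eta)^{1/2}\}\,\{(p + \lambda + \eta) + 2(\lambda\eta)^{1/2}\}$

shows that $s_p$ has branch points at

(6.6)  $p = -(\lambda + \eta) \pm 2(\lambda\eta)^{1/2} = -(\sqrt{\lambda} \mp \sqrt{\eta})^2.$

The branch points are thus seen to lie on the negative $p$ axis. By (4.3) and (6.4)

(6.7)  $e(p) = \frac{-(p + \lambda - \eta) \pm \{[p + (\sqrt{\lambda} - \sqrt{\eta})^2]\,[p + (\sqrt{\lambda} + \sqrt{\eta})^2]\}^{1/2}}{2\eta p}.$

The branch of $e(p)$ corresponding to the choice of + sign in (6.7) is required to insure the vanishing of $e(p)$ as $p \to +\infty$.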
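The relations (4.2), (4.3), (6.4) and (6.7) can be cross-checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes real $p > 0$ and $\lambda < \eta$, solves (4.2) by plain fixed-point iteration started from $s = 0$ (an illustrative choice of method), and compares the result with the root of (6.4) that vanishes as $p \to +\infty$. All function names are hypothetical.

```python
import math


def d_exp(p, eta):
    """Laplace transform (6.2) of the exponential density D(x) = eta*exp(-eta*x)."""
    return eta / (eta + p)


def s_fixed_point(p, lam, eta, iters=500):
    """Solve (4.2), s = d(p + lam*(1 - s)), by fixed-point iteration from s = 0.

    Assumes real p > 0 and lam < eta, for which the map is a contraction
    near the relevant root.
    """
    s = 0.0
    for _ in range(iters):
        s = d_exp(p + lam * (1.0 - s), eta)
    return s


def s_closed(p, lam, eta):
    """Root of (6.4) that tends to 0 as p -> +infinity (minus sign in (6.4))."""
    a = p + lam + eta
    return (a - math.sqrt(a * a - 4.0 * lam * eta)) / (2.0 * lam)


def e_from_s(p, lam, s):
    """Equation (4.3): e(p) = 1 / (p + lam*(1 - s_p))."""
    return 1.0 / (p + lam * (1.0 - s))


def e_closed(p, lam, eta):
    """Equation (6.7) with the + sign."""
    r = math.sqrt((p + (math.sqrt(lam) - math.sqrt(eta)) ** 2)
                  * (p + (math.sqrt(lam) + math.sqrt(eta)) ** 2))
    return (-(p + lam - eta) + r) / (2.0 * eta * p)


if __name__ == "__main__":
    lam, eta, p = 1.0, 2.0, 0.5
    print(s_fixed_point(p, lam, eta), s_closed(p, lam, eta))
    print(e_from_s(p, lam, s_closed(p, lam, eta)), e_closed(p, lam, eta))
    # Tauberian limit (5.1): p*e(p) should approach 1 - lam/eta as p -> 0+.
    print(1e-6 * e_closed(1e-6, lam, eta), 1.0 - lam / eta)
```

Note that the + sign in (6.7) pairs with the minus sign in (6.4), since $e(p) = 1/(p + \lambda\{1 - s_p\})$.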

The inversion of $e(p)$ can now be carried out. The contour we choose is indicated in Fig. 1 below where a finite branch cut has been made between the branch points $p_1$, $p_2$ as given by (6.6). Taking account of the simple pole at $p = 0$ as well as the branch cut, we obtain

(6.8)  $E(t) = 1 - \frac{\lambda}{\eta} - \frac{1}{2\pi i} \int_{C_1} e(p)\,e^{pt}\,dp.$
We point out that for $p \in C_1$, $\mathrm{Re}(p) < 0$, so that the second term in (6.8) represents transient behavior. The integral appearing in (6.8) can be simplified, leading to

(6.9)  $E(t) = 1 - \frac{\lambda}{\eta} + \frac{1}{\pi} \int_{u_1}^{u_2} e^{-ut}\,\frac{[(u - u_1)(u_2 - u)]^{1/2}}{2\eta u}\,du,$





FIG. 1

where $u_1 = -p_1$, $u_2 = -p_2$. We do not pursue the details of this solution further since this case has already been discussed by Morse [5] using a completely different approach. It is straightforward to bring (6.9) into the form obtained by Morse.
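As a numerical sanity check on the form of (6.9), the integral can be evaluated by quadrature. The sketch below is illustrative only: it assumes $\lambda < \eta$, takes $u_1 = (\sqrt{\eta} - \sqrt{\lambda})^2$ and $u_2 = (\sqrt{\eta} + \sqrt{\lambda})^2$ from (6.6), and uses the substitution $u = c + h\cos\theta$ with a midpoint rule (my choice, to remove the square-root endpoint behavior). The function name and the number of nodes are hypothetical.

```python
import math


def empty_probability(t, lam, eta, n=4000):
    """Evaluate E(t) from (6.9) for the Poisson/Poisson queue, assuming lam < eta.

    With u = c + h*cos(theta), [(u - u1)(u2 - u)]^(1/2) = h*sin(theta) and
    du = -h*sin(theta) d(theta), so the integrand becomes smooth on [0, pi].
    """
    u1 = (math.sqrt(eta) - math.sqrt(lam)) ** 2
    u2 = (math.sqrt(eta) + math.sqrt(lam)) ** 2
    c, h = 0.5 * (u1 + u2), 0.5 * (u2 - u1)
    total = 0.0
    for i in range(n):
        theta = math.pi * (i + 0.5) / n          # midpoint nodes
        u = c + h * math.cos(theta)
        total += math.exp(-u * t) * (h * math.sin(theta)) ** 2 / (2.0 * eta * u)
    integral = total * math.pi / n
    return 1.0 - lam / eta + integral / math.pi


if __name__ == "__main__":
    lam, eta = 1.0, 2.0
    for t in (0.0, 1.0, 5.0, 50.0):
        print(t, empty_probability(t, lam, eta))
```

Under these assumptions the value at $t = 0$ should reproduce the initial condition $E(0) = 1$, and for large $t$ the result should settle at the steady-state value $1 - \lambda/\eta$ of (5.1).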


APPENDIX 

By occupation times we mean the time intervals between states of complete vacancy. The distribution of these occupation times is well known [2]. It may, however, be obtained independently from the time dependent formalism developed in the text in the following way. At $t = 0$, the queue is started in the state $m = 0$, $x = 0$. Let $J(t)$ be the probability at time $t$ that the system has emptied. Then $dJ/dt = W_0(t)$ is the probability density function for the occupation times. Let $U_m(x, t)$ be the p.d.f. for the state $(m, x)$ at time $t$ conditioned on the system's not having emptied, and let $G(s, x, t) = \sum_{m=0}^{\infty} s^m U_m(x, t)$. It is clear that

(1)  $W_0(t) = \frac{dJ}{dt} = \int_0^\infty \eta(x)\,G(0, x, t)\,dx.$

The boundary conditions $U_m(0, t) = \int_0^\infty \eta(x)\,U_{m+1}(x, t)\,dx$ for all $m$ imply

(2)  $G(s, 0, t) = \int_0^\infty \eta(x)\,\frac{G(s, x, t) - G(0, x, t)}{s}\,dx,$

and, as before, $G(s, x, t)$ obeys Eq. (3.10). Thus

(3)  $G(s, x, t) = G(s, 0, t - x)\,\exp\{-\lambda(1 - s)x - N(x)\} + \delta(x - t)\,\exp\{-\lambda(1 - s)x - N(x)\},$

where $\delta(x)$ is the delta function, and $G(s, 0, t)$ is zero for negative $t$. If one substitutes (3) into (2) and takes the Laplace transform, $G(s, 0, p)$ is determined by the analyticity condition of Section 4.

From (1) and (3) we then have

(4)  $\int_0^\infty e^{-pt}\,W_0(t)\,dt = s_p,$

where $s_p$ is defined by Eq. (4.2).


REFERENCES 

[1] D. R. Cox, "The analysis of non-Markovian stochastic processes by the inclusion of supplementary variables," Proc. Camb. Phil. Soc., Vol. 51 (1955), pp. 433-441.

[2] L. Takács, "Investigation of waiting time problems by reduction to Markov processes," Acta Math. Acad. Sci. Hung., Vol. 6 (1955), pp. 101-129.

[3] D. G. Kendall, "Some problems in the theory of queues," J. Roy. Stat. Soc., Ser. B, Vol. 13 (1951), pp. 151-185.

[4] D. V. Lindley, "The theory of queues with a single server," Proc. Camb. Phil. Soc., Vol. 48 (1952), pp. 277-289.

[5] P. M. Morse, "Stochastic properties of waiting lines," J. Opns. Res. Soc. Amer., Vol. 3 (1955), pp. 255-261.





A GEOMETRY OF BINARY SEQUENCES ASSOCIATED WITH 
GROUP ALPHABETS IN INFORMATION THEORY' 


By R. C. Bose and Roy R. Kuebler, Jr.
University of North Carolina 


1. The group alphabet. When a piece of information, or letter, is transmitted over a symmetric binary channel [14], the letter is presented to the channel in the form of a sequence of $n$ binary digits. Because of noise on the channel, there is a positive probability $p$ that a transmitted symbol will be received in error, that is, a transmitted 0 will be received as 1 or a transmitted 1 received as 0. It is assumed that $0 < p < 1/2$, and that the noise on the channel operates independently on each symbol that is presented for transmission. If the collection of all distinct pieces of information (the alphabet) consists of $K = 2^k$ letters, it is customary to take $n > k$, and in some manner use the additional digit positions to "correct" errors in transmission. Slepian [14] has introduced the $n$-place group alphabet, or, briefly, the $(n, k)$-alphabet, and an associated decoding scheme. The $2^n$ possible binary sequences form an Abelian group $B_n$, wherein the group operation is addition modulo 2 of vectors given by the sequences. An $(n, k)$-alphabet is a $2^k$-letter $n$-place binary signaling alphabet whose letters form a subgroup of $B_n$.

Let us designate the letters of the alphabet by

$U_0 = I = (000 \cdots 0),\ U_1,\ U_2,\ \cdots,\ U_\mu, \qquad \mu = 2^k - 1.$

The group $B_n$ can be developed according to the alphabet and its cosets:

$I = U_0 = L_0, \quad U_1, \quad U_2, \quad \cdots, \quad U_\mu,$
$L_1, \quad L_1 + U_1, \quad L_1 + U_2, \quad \cdots, \quad L_1 + U_\mu,$
$L_2, \quad L_2 + U_1, \quad L_2 + U_2, \quad \cdots, \quad L_2 + U_\mu,$
$\cdots$
$L_\nu, \quad L_\nu + U_1, \quad L_\nu + U_2, \quad \cdots, \quad L_\nu + U_\mu,$

where $\nu = 2^{n-k} - 1$, and $L_i$ is an $n$-place binary sequence which has not appeared in cosets led by $L_0, L_1, \cdots, L_{i-1}$. The group elements $L_i$ are called coset leaders.

The weight $w(T_i)$ of an element $T_i$ of $B_n$ is defined as the number of ones in the $n$-place binary sequence $T_i$.

Because of the group property, any coset is repeated, with elements in a 


Received December 1, 1958; revised May 25, 1959. 

1 This research was supported by the United States Air Force through the Air Force 
Office of Scientific Research of the Air Research and Development Command, and by the 
United States Public Health Service through a grant to support a research fellowship in 
biostatistics. 


different order, if the coset leader is replaced by any other element of the coset. It is then agreed that $L_i$ will be taken as that element (or any one of these elements) of the coset $i$ whose weight is least. The detection scheme is then the following: if the element of $B_n$ which is received from the channel output lies in column $i$ of the coset array, the detector prints the letter $U_i$.

The following is an example ($k = 3$, $n = 5$) of such an array.

                    I      U_1    U_2    U_3    U_4    U_5    U_6    U_7
      Alphabet:   00000  00111  11101  00011  11010  00100  11110  11001
(2)               10000  10111  01101  10011  01010  10100  01110  01001
      Cosets:     01000  01111  10101  01011  10010  01100  10110  10001
                  00010  00101  11111  00001  11000  00110  11100  11011

For such a code, given $p$ and setting $q = 1 - p$,

(3)  $Q_1 = \Pr(\text{transmitted letter } U_i \text{ be correctly produced by the detector}) = \sum_{l=0}^{\nu} p^{w(L_l)}\,q^{n - w(L_l)}.$

Since $p^w q^{n-w}$ is a monotonically decreasing function of $w$, one sees the motivation for taking as $L_i$ an element of minimal weight in coset $i$.
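The coset development and formula (3) can be illustrated on the example array (2). The sketch below is not from the paper: it builds the alphabet from the three generators $U_1$, $U_2$, $U_3$, picks one minimum-weight leader per coset by scanning all sequences in order of increasing weight (so the particular leader chosen in a coset may differ from the one printed in (2), for instance 00001 instead of 00010), and evaluates $Q_1$. Function names and the value $p = 0.1$ are illustrative assumptions.

```python
from itertools import product


def add(a, b):
    """Coordinatewise addition modulo 2 of two binary sequences."""
    return tuple((x + y) % 2 for x, y in zip(a, b))


def weight(a):
    """Number of ones in a binary sequence."""
    return sum(a)


def span(generators):
    """All 2^k letters generated by k independent sequences."""
    n = len(generators[0])
    letters = []
    for coeffs in product((0, 1), repeat=len(generators)):
        word = (0,) * n
        for c, g in zip(coeffs, generators):
            if c:
                word = add(word, g)
        letters.append(word)
    return letters


def coset_leaders(letters, n):
    """One minimum-weight leader per coset of the alphabet in B_n."""
    seen, leaders = set(), []
    # Scanning by increasing weight guarantees that the first unseen
    # sequence of each coset has least weight in that coset.
    for seq in sorted(product((0, 1), repeat=n), key=weight):
        if seq not in seen:
            leaders.append(seq)
            seen.update(add(seq, u) for u in letters)
    return leaders


def q1(leaders, n, p):
    """Formula (3): sum over coset leaders of p^w(L) * q^(n - w(L))."""
    q = 1.0 - p
    return sum(p ** weight(l) * q ** (n - weight(l)) for l in leaders)


GENERATORS = [(0, 0, 1, 1, 1), (1, 1, 1, 0, 1), (0, 0, 0, 1, 1)]

if __name__ == "__main__":
    letters = span(GENERATORS)
    leaders = coset_leaders(letters, 5)
    print(leaders)
    print(q1(leaders, 5, 0.1))
```

For this alphabet the leader weights are 0, 1, 1, 1, so (3) reduces to $Q_1 = q^5 + 3pq^4$.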

As Slepian has observed [14]: "Two important questions regarding $(n, k)$-alphabets naturally arise. What is the maximum value of $Q_1$ possible for a given $n$ and $k$ and which of the $\cdots$ different subgroups [alphabets] give rise to this maximum $Q_1$? The answers to these questions for general $n$ and $k$ are not known. For many special values of $n$ and $k$ the answers are known." The present paper
is directed towards developing a geometry which can give an additional tool 
for use in studies on group alphabets. To the reader interested in other aspects 
of the coding problem there may be cited, as representative of analyses of the 
problem and methods of approach, papers of Hamming [10], Gilbert [8], Golay 
[9], Elias [5, 6], Reed [13], Lloyd [11], Calabi and Haefeli [4], MacDonald [12], 
Fontaine and Peterson [7]. 


2. An algebra of binary sequences. We introduce an algebra of binary sequences, defined as follows. The elements of the algebra are the $n$-place binary sequences $T_1, T_2, \cdots, T_{2^n}$, where $T_j = (a_{j1}, a_{j2}, \cdots, a_{jn})$, each $a_{ji}$ being either zero or one. In the present case, all letters of the alphabet and all elements of cosets are binary sequences of this nature.

Addition is defined by

(4)  $T_i + T_j = (a_{i1} + a_{j1},\ a_{i2} + a_{j2},\ \cdots,\ a_{in} + a_{jn}),$

where the addition is vector addition modulo 2. This addition clearly has all the usual properties: commutativity, associativity, inversion. We note a special property of this addition: for any $n$-place binary sequence $T$, $2T = T + T = (000 \cdots 0)$, the null sequence.

The product $T_iT_j$ is defined as

(5)  $T_iT_j = (a_{i1}a_{j1},\ a_{i2}a_{j2},\ \cdots,\ a_{in}a_{jn});$

that is, the coordinates of the product are the products of matching coordinates of the factors. This multiplication of course has all the usual multiplicative properties: commutativity, associativity, distributivity. However, inversion is not satisfied; that is, division is not unique. We shall not perform the division operation. Since

$a_{iu}a_{ju} = 1$ when and only when $a_{iu} = a_{ju} = 1$, and $a_{iu}a_{ju} = 0$ otherwise,

a special property of the multiplication in this algebra is that every element is idempotent; that is, for any $n$-place binary sequence $T$,

(6)  $T^2 = T.$


Two particularly useful properties follow from (6). For any two sequences $T_1$, $T_2$,

(7)  $(T_1 + T_1T_2)(T_2 + T_1T_2) = T_1T_2 + T_1^2T_2 + T_1T_2^2 + T_1^2T_2^2 = 4T_1T_2 = 0,$
(8)  $(T_1 + T_2)(T_1T_2) = T_1^2T_2 + T_1T_2^2 = 2T_1T_2 = 0.$

Consider the weights of two binary sequences $T_i$ and $T_j$, where $w(T)$ is as defined in Section 1 (the number of unities in $T$), and let us investigate $w(T_i + T_j)$ and $w(T_iT_j)$. An example of these $T$'s could be


T:( 11 0), 
T::(0 0 1 0), 


1 10 

. as 
1,i.+7::(1 10001 0), 

.-. . 


TiT:: (0 0 1 0). 


$T_iT_j$ has unities only in those positions occupied by unities in both $T_i$ and $T_j$, so that $w(T_iT_j)$ is the number of unit coordinates common to $T_i$ and $T_j$. These are precisely the unit coordinates yielding zeros in the sum $T_i + T_j$. Thus we have the following theorems.

THEOREM 1. $w(T_iT_j) \le \min\,[w(T_i), w(T_j)]$.

THEOREM 2. $w(T_i + T_j) = w(T_i) + w(T_j) - 2w(T_iT_j)$.

Useful corollaries to Theorem 2 are the following.

COROLLARY 2.1. If $w(T_iT_j) = w(T_iT_k) = w(T_i)$, then $w(T_jT_k) \ge w(T_i)$.

PROOF. The theorem gives

(9)  $w(T_j + T_k) = w(T_j) + w(T_k) - 2w(T_jT_k).$

But also

$w(T_j + T_k) = w[(T_i + T_j) + (T_i + T_k)]$
$\le w(T_i + T_j) + w(T_i + T_k)$
$= w(T_i) + w(T_j) - 2w(T_iT_j) + w(T_i) + w(T_k) - 2w(T_iT_k)$
$= w(T_j) + w(T_k) - 2w(T_i).$

Applying this result to (9), we obtain immediately $w(T_jT_k) \ge w(T_i)$.





COROLLARY 2.2. If $T_i = T_iT_j$ and $w(T_i) = w(T_j)$, then $T_i = T_j$.

PROOF. From the given conditions we have

$w(T_i + T_j) = w(T_i) + w(T_j) - 2w(T_iT_j)$
$= w(T_i) + w(T_i) - 2w(T_i) = 0.$

That is, $T_i + T_j$ is the null sequence, whence $T_i = T_j$.

THEOREM 3. The necessary and sufficient condition that $w(T_iT_j) = w(T_i)$ is that $T_iT_j = T_i$.

PROOF. If $T_iT_j = T_i$, then obviously $w(T_iT_j) = w(T_i)$. Conversely, let $w(T_iT_j) = w(T_i)$. Before applying the condition, we have from the definition of multiplication that $T_iT_j$ has zero for each coordinate which is zero in $T_i$. The remaining coordinates of $T_iT_j$ are those which in $T_i$ are unities. But if $w(T_iT_j) = w(T_i)$, then these coordinates must be unities in $T_iT_j$. Hence, the coordinates of $T_iT_j$ are identical with those of $T_i$, that is, $T_iT_j = T_i$.
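The statements of this section are finite and can be checked exhaustively for small $n$. The sketch below is an illustration rather than a proof: it represents sequences as tuples of 0s and 1s, and verifies (6), Theorems 1 to 3 over all pairs, and Corollaries 2.1 and 2.2 over all triples and pairs, for one chosen $n$. The function names and the choice $n = 4$ in the usage line are assumptions made for the example.

```python
from itertools import product


def add(a, b):
    """Sum (4): coordinatewise addition modulo 2."""
    return tuple((x + y) % 2 for x, y in zip(a, b))


def mul(a, b):
    """Product (5): coordinatewise multiplication."""
    return tuple(x * y for x, y in zip(a, b))


def weight(a):
    """Number of unities in the sequence."""
    return sum(a)


def check_algebra(n):
    """Exhaustively verify (6), Theorems 1-3 and Corollaries 2.1-2.2 for n places."""
    seqs = list(product((0, 1), repeat=n))
    for ti in seqs:
        assert mul(ti, ti) == ti                                   # (6) idempotence
        for tj in seqs:
            wij = weight(mul(ti, tj))
            assert wij <= min(weight(ti), weight(tj))              # Theorem 1
            assert weight(add(ti, tj)) == weight(ti) + weight(tj) - 2 * wij  # Theorem 2
            assert (wij == weight(ti)) == (mul(ti, tj) == ti)      # Theorem 3
            if mul(ti, tj) == ti and weight(ti) == weight(tj):
                assert ti == tj                                    # Corollary 2.2
            for tk in seqs:
                if wij == weight(ti) and weight(mul(ti, tk)) == weight(ti):
                    assert weight(mul(tj, tk)) >= weight(ti)       # Corollary 2.1
    return True


if __name__ == "__main__":
    print(check_algebra(4))
```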


3. A geometry of binary sequences. Application is made of the notions of finite projective geometry introduced by Bose [1] and used by him and others in the development of incomplete block and factorial designs (for example, [1] and [2]).

The group alphabet in which we are interested is composed of the null letter $I = (000 \cdots 0)$ and $\mu = 2^k - 1$ nonnull letters, $U_1, U_2, \cdots, U_\mu$, which are generated by any $k$ independent sequences, say

$U_1 = (a_{11}, a_{12}, a_{13}, \cdots, a_{1n}),$
(10)  $U_2 = (a_{21}, a_{22}, a_{23}, \cdots, a_{2n}),$
$\cdots$
$U_k = (a_{k1}, a_{k2}, a_{k3}, \cdots, a_{kn}),$

where the $a_{ij}$ are elements of the Galois field GF(2) and not all zero. The general nonnull letter $U$ of the alphabet is thus

(11)  $U = \lambda_1U_1 + \lambda_2U_2 + \cdots + \lambda_kU_k,$


where $\lambda_1, \lambda_2, \cdots, \lambda_k$ are elements of GF(2), not all zero. For example, the nonnull letters of the alphabet (2) can be taken as:

$U_1 = (00111) = 1(U_1) + 0(U_2) + 0(U_3),$
$U_2 = (11101) = 0(U_1) + 1(U_2) + 0(U_3),$
$U_3 = (00011) = 0(U_1) + 0(U_2) + 1(U_3),$
(12)  $U_4 = (11010) = 1(U_1) + 1(U_2) + 0(U_3),$
$U_5 = (00100) = 1(U_1) + 0(U_2) + 1(U_3),$
$U_6 = (11110) = 0(U_1) + 1(U_2) + 1(U_3),$
$U_7 = (11001) = 1(U_1) + 1(U_2) + 1(U_3).$

Our geometry must take into account these $2^k - 1$ letters, and also all the remaining $2^n - 2^k$ possible nonnull binary sequences.
Consider a (topological) space $\Omega$ consisting of $n$ distinct points $Y_1, Y_2, \cdots, Y_n$, where the point $Y_i$ of $\Omega$ is considered to correspond to the $i$-th position in an $n$-place binary sequence. In other words, $\Omega$ is a space of positions. Each binary sequence $T_j = (a_{j1}, a_{j2}, \cdots, a_{jn})$ corresponds to a unique subset of $\Omega$, namely, the subset of those positions which are occupied by unity in $T_j$. For example, if $n = 6$, the binary sequence (011001) corresponds to the subset $(Y_2, Y_3, Y_6)$ of $\Omega$. Thus, $Y_i$ is a member of the subset $\Omega_j$ corresponding to $T_j$ if and only if $a_{ji}$ is unity. Conversely, given any subset $\Omega_j$ of $\Omega$, we can at once write down the corresponding binary sequence $T_j$ by taking unities in those positions which correspond to the elements of $\Omega_j$, and zeros in the other places. For example, if $n = 7$ and $\Omega_j = (Y_1, Y_2, Y_4, Y_6)$, we have at once $T_j = (1101010)$. Thus the $2^n$ binary sequences have (1,1) correspondence with the $2^n$ distinct subsets of $\Omega$. In particular, the whole space $\Omega$ corresponds to the sequence $E = (111 \cdots 1)$, the unit element of the ring algebra introduced in the preceding section, and the null set corresponds to the sequence $I = (000 \cdots 0)$, the zero element of the ring.

As any other sequence, the letter $U_i$ of the alphabet (11) corresponds to the $\Omega$-subset of those positions in which $U_i$ has unities. We shall denote this subset by $\Omega_1(U_i)$. We shall denote by $\Omega_0(U_i)$ the complementary set of positions (which are occupied by zero in $U_i$). These two sets are disjoint, and their union gives the whole space $\Omega$. Referring to (12) for example, we have $\Omega_1(U_7) = (Y_1, Y_2, Y_5)$ and $\Omega_0(U_7) = (Y_3, Y_4)$.

For any $k$-place sequence $y_1, y_2, \cdots, y_k$ of elements of GF(2), we define the subset $\Omega(y_1, y_2, \cdots, y_k)$ by

(13)  $\Omega(y_1, y_2, \cdots, y_k) = \Omega_{y_1}(U_1) \cap \Omega_{y_2}(U_2) \cap \cdots \cap \Omega_{y_k}(U_k).$

For example, referring again to (12), we have

$\Omega(1, 0, 1) = \Omega_1(U_1) \cap \Omega_0(U_2) \cap \Omega_1(U_3)$
$= (Y_3, Y_4, Y_5) \cap (Y_4) \cap (Y_4, Y_5)$
$= (Y_4).$


As shown by the definition (13), an element of $\Omega$ is a member of $\Omega(y_1, y_2, \cdots, y_k)$ if and only if it is a position which is occupied by $y_1$ in $U_1$, $y_2$ in $U_2$, $\cdots$, $y_k$ in $U_k$. Each such position will then be occupied by $\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k$ in $U = \lambda_1U_1 + \lambda_2U_2 + \cdots + \lambda_kU_k$. Thus, if in our preceding example, where $\Omega(1, 0, 1) = Y_4$, we consider $(\lambda_1, \lambda_2, \lambda_3) = (1, 1, 0)$, we have

$\lambda_1y_1 + \lambda_2y_2 + \lambda_3y_3 = 1(1) + 1(0) + 0(1) = 1,$

which is seen to be the digit occupying the fourth position in

$1(U_1) + 1(U_2) + 0(U_3) = (11010).$

Hence, if

(14)  $\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k \ne 0,$

then each element of $\Omega(y_1, y_2, \cdots, y_k)$ is a member of the set $\Omega_1(U)$ corresponding to $U = \lambda_1U_1 + \lambda_2U_2 + \cdots + \lambda_kU_k$; otherwise, each element of $\Omega(y_1, y_2, \cdots, y_k)$ is a member of the complementary set $\Omega_0(U)$.

We can assume that there is no position which is occupied by zero in every one of $U_1, U_2, \cdots, U_k$, for otherwise this position would be occupied by zero in every letter of the alphabet and would therefore convey no information. Hence the set $\Omega(0, 0, \cdots, 0)$ is always null, and we shall neglect it. Thus there are $2^k - 1$ different sets $\Omega(y_1, y_2, \cdots, y_k)$, $(y_1, y_2, \cdots, y_k) \ne (0, 0, \cdots, 0)$. Any two distinct sequences $y_1, y_2, \cdots, y_k$ will differ as regards at least one element, and hence the two sets $\Omega(y_1, y_2, \cdots, y_k)$ will differ with respect to at least one factor in (13), say the $u$-th. Then we shall have $\Omega_1(U_u)$ in one case, and $\Omega_0(U_u)$ in the other. These two sets are disjoint, and hence so are the two sets $\Omega(y_1, y_2, \cdots, y_k)$, since each is a subset of each of its factors. Furthermore, since each position is clearly defined in every $U_i$, every element of $\Omega$ is a member of some $\Omega(y_1, y_2, \cdots, y_k)$; one need only write down for $y_1, y_2, \cdots, y_k$ the digits occupying the corresponding position in $U_1, U_2, \cdots, U_k$ [in (12) for example, $Y_3$ is an element of $\Omega(1, 1, 0)$]. Hence, as $(y_1, y_2, \cdots, y_k)$ runs over the $2^k - 1$ possible sets of values, the sets $\Omega(y_1, y_2, \cdots, y_k)$ exhaust $\Omega$. Thus the sets $\Omega(y_1, y_2, \cdots, y_k)$ are disjoint, and their union gives the whole space.

From what has been stated concerning (14) above, it is clear that the set $\Omega_1(U)$ corresponding to $U$ is the union of all the (disjoint) sets $\Omega(y_1, y_2, \cdots, y_k)$ for which $(y_1, y_2, \cdots, y_k)$ satisfies (14). If $n(y_1, y_2, \cdots, y_k)$ denotes the number of points in $\Omega(y_1, y_2, \cdots, y_k)$, then, since the sets $\Omega(y_1, y_2, \cdots, y_k)$ are disjoint and exhaust $\Omega$, $\sum n(y_1, y_2, \cdots, y_k) = n$, the summation being over all the $2^k - 1$ values $(y_1, y_2, \cdots, y_k)$. Also, for the weight $w(U)$ of $U$, as defined in Section 1, we have

(15)  $w(U) = {\sum}'\,n(y_1, y_2, \cdots, y_k),$

where ${\sum}'$ indicates summation over all those values $(y_1, y_2, \cdots, y_k)$ satisfying $\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k = 1$.
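The partition of $\Omega$ into the sets $\Omega(y_1, \cdots, y_k)$ and the weight formula (15) can be traced on the alphabet (12). The following sketch is illustrative: it reads each position's column $(y_1, y_2, y_3)$ off the generators $U_1$, $U_2$, $U_3$, groups positions by column, and compares the weight obtained from (15) with the weight counted directly. The names `cells`, `letter` and `weight_from_cells` are hypothetical.

```python
from itertools import product

# Generators U_1, U_2, U_3 of the alphabet (12).
GENERATORS = [(0, 0, 1, 1, 1), (1, 1, 1, 0, 1), (0, 0, 0, 1, 1)]


def cells(gens):
    """Map each k-tuple y to the positions (1-based) occupied by y_1 in U_1, ..., y_k in U_k."""
    out = {}
    for pos in range(len(gens[0])):
        y = tuple(g[pos] for g in gens)
        out.setdefault(y, []).append(pos + 1)
    return out


def letter(lams, gens):
    """The letter U = lam_1 U_1 + ... + lam_k U_k, with arithmetic modulo 2."""
    n = len(gens[0])
    return tuple(sum(l * g[i] for l, g in zip(lams, gens)) % 2 for i in range(n))


def weight_from_cells(lams, cell_map):
    """Formula (15): add n(y) over all y with lam_1 y_1 + ... + lam_k y_k = 1."""
    return sum(len(positions) for y, positions in cell_map.items()
               if sum(l * yi for l, yi in zip(lams, y)) % 2 == 1)


if __name__ == "__main__":
    cell_map = cells(GENERATORS)
    print(cell_map)
    for lams in product((0, 1), repeat=3):
        if any(lams):
            u = letter(lams, GENERATORS)
            print(lams, u, sum(u), weight_from_cells(lams, cell_map))
```

In this example $\Omega(1, 0, 1)$ is the single position $Y_4$ and $\Omega(0, 1, 0)$ contains $Y_1$ and $Y_2$, in agreement with the text.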

Consider now the finite projective space PG($k - 1$, 2), and to the point $P = (y_1, y_2, \cdots, y_k)$ of this space associate the set $\Omega(y_1, y_2, \cdots, y_k)$. Let the points of PG($k - 1$, 2) be $P_1, P_2, \cdots, P_\mu$, $\mu = 2^k - 1$, where $P_i = (y_{1i}, y_{2i}, \cdots, y_{ki})$. If we define the $n$-measure of the point $P_i$ as $n(P_i) = n(y_{1i}, y_{2i}, \cdots, y_{ki}) = n_i$, then there are $n_i$ points of $\Omega$ which constitute the set associated with $P_i$. These points we may now rename as $P_{i1}, P_{i2}, \cdots, P_{in_i}$, and identify with the point $P_i$ taken $n_i$ times. Thus $\Omega$ may be considered to consist of the points $P_1, P_2, \cdots, P_\mu$, the point $P_i$ being taken with a multiplicity $n_i$. If in particular $n_i = 0$, then $P_i$ does not belong to $\Omega$. It is useful to institute a logical distinction between geometric points and $\Omega$-points. Each $P_i$ constitutes a single geometric point, but counts as $n_i = n(P_i)$ $\Omega$-points. The total number of geometric points is $\mu = 2^k - 1$; the total number of $\Omega$-points is $n$.





The points $(y_1, y_2, \cdots, y_k)$ which satisfy (14) are the points not lying on the $(k - 2)$-flat

(16)  $\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k = 0.$

This $(k - 2)$-flat we shall call the $U$-associated flat, where

$U = \lambda_1U_1 + \lambda_2U_2 + \cdots + \lambda_kU_k.$

The set $\Omega_1(U)$ corresponding to $U$ is then the union of the sets $\Omega(y_1, y_2, \cdots, y_k)$ associated with those points $(y_1, y_2, \cdots, y_k)$ of PG($k - 1$, 2) which do not lie on the $U$-associated flat. We shall call such points $U$-associated points. Hence, from (15)

(17)  $w(U)$ = the number of $U$-associated points in $\Omega$.

This result is a special application of a property studied in another connection by Bose and Burton in [3].

As is clear in all the preceding discussion, the ordering of the points in $\Omega$ is completely immaterial. However, for notational convenience, and to fix ideas, we shall ordinarily take $Y_1, Y_2, \cdots, Y_n = P_{11}, P_{12}, \cdots, P_{1n_1}, P_{21}, P_{22}, \cdots, P_{2n_2}, \cdots, P_{\mu 1}, P_{\mu 2}, \cdots, P_{\mu n_\mu}$. The subset $(P_{i1}, P_{i2}, \cdots, P_{in_i})$ may be represented by $P_i^{n_i}$; it must be understood to be empty when $n_i = 0$. The correspondence between the points of $\Omega$ and the generating letters $U_1, U_2, \cdots, U_k$ of the alphabet can be exhibited in the following array.

(18)
$\Omega$-point:   $P_{11} \cdots P_{1n_1}$   $P_{21} \cdots P_{2n_2}$   $\cdots$   $P_{\mu 1} \cdots P_{\mu n_\mu}$
$U_1$:   $y_{11} \cdots y_{11}$   $y_{12} \cdots y_{12}$   $\cdots$   $y_{1\mu} \cdots y_{1\mu}$
$U_2$:   $y_{21} \cdots y_{21}$   $y_{22} \cdots y_{22}$   $\cdots$   $y_{2\mu} \cdots y_{2\mu}$
$\cdots$
$U_k$:   $y_{k1} \cdots y_{k1}$   $y_{k2} \cdots y_{k2}$   $\cdots$   $y_{k\mu} \cdots y_{k\mu}$

This is precisely the array (10) with respect to the $U$'s. Each $U$ is given by a horizontal sequence of $n$ digits. But now we can see the columns of the array as the sets of homogeneous coordinates of points in PG($k - 1$, 2). Thus, $U_1, U_2, \cdots, U_k$, and hence all the letters of the alphabet, are completely defined by the ordered set of $\Omega$-points. Hence the study of group alphabets can be pursued through the study of sets $\Omega$ composed of points of PG($k - 1$, 2).

We should call attention at this point to the correspondence between our geometric representation and the group representation of Slepian. Slepian [14] employs the isomorphism of $B_n$, the group of $n$-place binary sequences under the operation of addition modulo 2, with the abstract group $C_n$ generated by $n$ commuting elements of order 2, together with the isomorphism of the $(n, k)$-alphabet (subgroup of $B_n$ of order $2^k$) with $C_k$. Rows 2 through $(k + 1)$ of Slepian's modular representation table for the group $C_k$ give, columnwise, precisely the homogeneous coordinates of the points $P_i$ of PG($k - 1$, 2). When Slepian forms an $(n, k)$-alphabet by choosing $n$ columns (including possible repetitions) of the modular representation table, the rows 2 through $(k + 1)$ of the resulting array are, to within possible permutation of columns, precisely our representation of $\Omega$. Our measure $n_\beta$ for the point $P_\beta$ is Slepian's quantity $d_\beta$, both indicating the number of times the $\beta$-set of coordinates is taken from PG($k - 1$, 2), or from the modular representation table, for forming the $(n, k)$-alphabet.

Obviously, if $P_{iu}$ is outside $\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k = 0$ for any $u$, it is outside the flat for all $u$. That is, all repetitions of $P_i$ in $\Omega$ are $U$-associated points. On any $(k - 2)$-flat in PG($k - 1$, 2) there are $2^{k-1} - 1$ points, so that there are $(2^k - 1) - (2^{k-1} - 1) = 2^{k-1}$ points of PG($k - 1$, 2) lying outside the flat. Hence the letter $U = \lambda_1U_1 + \lambda_2U_2 + \cdots + \lambda_kU_k$ is uniquely defined by the $\Omega$-points contributed by the $2^{k-1}$ geometric points lying outside

$\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k = 0$

in PG($k - 1$, 2). As suggested in the remark preceding (18), we shall use exponent notation to indicate multiplicity, setting

(19)  $U = P_{i_1}^{n_{i_1}} P_{i_2}^{n_{i_2}} \cdots P_{i_s}^{n_{i_s}}, \qquad s = 2^{k-1}.$


The order of the $P_i$'s in (19) is clearly of no importance, nor are the exponents essential, since the symbol $P_i$ by itself specifies that $U$ has unities in those positions occupied by $P_{i1}, P_{i2}, \cdots, P_{in_i}$ in the ordered sequence of $\Omega$-points $Y_1, Y_2, \cdots, Y_n$. Thus there is a (1,1) correspondence between a letter

$U = \lambda_1U_1 + \lambda_2U_2 + \cdots + \lambda_kU_k$

and the set of geometric points

(20)  $(P_{i_1} P_{i_2} \cdots P_{i_s})$

which lie outside the $(k - 2)$-flat $\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k = 0$.

Let $\bar{S}$ represent the complement of the set $S$ with respect to the entire space, here PG($k - 1$, 2). From (20) we have the useful correspondences:

(21)  $U$: $\{P_{i_1}, P_{i_2}, \cdots, P_{i_s}$ lying outside $\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k = 0\}$,
      $\bar{U}$: $\{P_i$ lying on $\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k = 0\}$, or simply $\lambda_1y_1 + \lambda_2y_2 + \cdots + \lambda_ky_k = 0$.


The letter $U$ can thus be viewed in any one of the following ways.
(i) $U$ is a sequence of $n$ binary digits. As such, $U$ is an element of the ring algebra introduced in Section 2.
(ii) $U$ is the set $\Omega_1(U)$ of $U$-associated points in $\Omega$. As such, $U$ is given by (19).
(iii) $U$ is the set of geometric points of PG($k - 1$, 2) lying outside the $U$-associated flat. In this view, the multiplicity $n_i$ is understood as attached to the point $P_i$, and $U$ is expressed by (20).

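As an example of this geometry, we shall consider again the alphabet in (2). Here $k = 3$, $n = 5$, PG($k - 1$, 2) is the projective plane PG(2, 2), and $(k - 2)$-flats are lines. The situation is presented in Figure 1.

FIG. 1. The projective plane PG(2, 2), with its seven lines labeled by the associated letters: $U_1$: $y_1 = 0$; $U_2$: $y_2 = 0$; $U_3$: $y_3 = 0$; $U_4$: $y_1 + y_2 = 0$; $U_5$: $y_1 + y_3 = 0$; $U_6$: $y_2 + y_3 = 0$; $U_7$: $y_1 + y_2 + y_3 = 0$. Labeled points include $P_1$: (1, 0, 0) and $P_3$: (0, 0, 1).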


One alphabet design assigns measures $n(P_i) = n_i$ to the points $P_i$ as shown in the following table. The subsets of associated $\Omega$-points are then as indicated.

Geometric points:                  $P_1$    $P_2$         $P_3$    $P_4$    $P_5$    $P_6$    $P_7$
$n(P_i)$:                          0        2             0        1        1        0        1
Associated subsets of $\Omega$:    null     $Y_1, Y_2$    null     $Y_3$    $Y_4$    null     $Y_5$

Taking points of $\Omega$ in column form, as in (18), we have

Positions:           $Y_1$      $Y_2$      $Y_3$      $Y_4$      $Y_5$
Geometric points:    $P_{21}$   $P_{22}$   $P_{41}$   $P_{51}$   $P_{71}$
$U_1$:               0          0          1          1          1
$U_2$:               1          1          1          0          1
$U_3$:               0          0          0          1          1


Then

$U_1 = (00111), \quad U_2 = (11101), \quad U_3 = (00011),$

and the alphabet follows as in (12). The various identifications of the letters are as follows.

Letter $U = \lambda_1U_1 + \lambda_2U_2 + \lambda_3U_3$ | Binary sequence | Set of geometric points | Set of associated $\Omega$-points
$U_1 = 1(U_1) + 0(U_2) + 0(U_3)$ | (00111) | $(P_1P_4P_5P_7)$ | $P_1^0P_4^1P_5^1P_7^1$
$U_2 = 0(U_1) + 1(U_2) + 0(U_3)$ | (11101) | $(P_2P_4P_6P_7)$ | $P_2^2P_4^1P_6^0P_7^1$
$U_3 = 0(U_1) + 0(U_2) + 1(U_3)$ | (00011) | $(P_3P_5P_6P_7)$ | $P_3^0P_5^1P_6^0P_7^1$
$U_4 = 1(U_1) + 1(U_2) + 0(U_3)$ | (11010) | $(P_1P_2P_5P_6)$ | $P_1^0P_2^2P_5^1P_6^0$
$U_5 = 1(U_1) + 0(U_2) + 1(U_3)$ | (00100) | $(P_1P_3P_4P_6)$ | $P_1^0P_3^0P_4^1P_6^0$
$U_6 = 0(U_1) + 1(U_2) + 1(U_3)$ | (11110) | $(P_2P_3P_4P_5)$ | $P_2^2P_3^0P_4^1P_5^1$
$U_7 = 1(U_1) + 1(U_2) + 1(U_3)$ | (11001) | $(P_1P_2P_3P_7)$ | $P_1^0P_2^2P_3^0P_7^1$





To reiterate the meaning of the exponent notation relating to $\Omega$-points, we note the example

$U_4 = P_1^0P_2^2P_5^1P_6^0 = P_{21}P_{22}P_{51}.$


Consider now the binary sequences which are not letters, that is, the sequences which are members of cosets. As we saw when the $\Omega$-set was introduced, every binary sequence is in correspondence with a subset of $\Omega$. Hence each nonletter can be identified with a subset of $\Omega$-points; as in the case of letters, we shall call the points of such a subset the associated points of the sequence. But the $\Omega$-set corresponding to a nonletter does not necessarily include all $n_i$ repetitions of $P_i$ as in the case of a letter $U$. Thus, for example, in the array (2), $L_1 = (10000) = P_{21}$, or again, $L_1 + U_6 = (10000) + (11110) = (01110) = P_{22}P_{41}P_{51}$. Hence, in general the binary sequences which are not letters cannot be identified with sets of geometric points as can the letters. Of course there will be special cases in which the $\Omega$-subset defining a nonletter will contain all $n_i$ repetitions of $P_i$ for all $i$ in the subset, and indeed these special cases as they affect coset leaders $L$ are of particular importance in design investigations.

Some observations should be made on the relation between operations performed in the algebra of binary sequences and operations performed on corresponding sets. By the definition of multiplication (5), $T_iT_j$ has unities in just those positions which are occupied by unities in both $T_i$ and $T_j$. Hence $T_iT_j$ is defined by the subset of $\Omega$-points associated with both $T_i$ and $T_j$. Thus, when regarded as sets of $\Omega$-points,

(23)  $T_iT_j = T_i \cap T_j.$

By the definition of addition (4), $T_i + T_j$ has unities in those positions occupied by unities in one of $T_i$ and $T_j$ but not in both. Hence $T_i + T_j$ is defined by those associated points of $T_i$ and $T_j$ which are not common. Thus, regarded as sets of $\Omega$-points,

(24)  $T_i + T_j = T_i \cup T_j - \{T_i \cap T_j\}.$

When each member of the pair $T_i$, $T_j$ is a sequence whose set of associated $\Omega$-points includes $P_{i1}, P_{i2}, \cdots, P_{in_i}$ whenever it includes any $P_{iu}$ [such sequences are the letters $U$ and the special cases of nonletters mentioned above], then (23) and (24) apply with the sequences regarded as sets of geometric points $(P_i)$.
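The identifications (23) and (24) amount to reading the product as set intersection and the sum as symmetric difference of position sets. A minimal sketch, using the sequences $L_1 = (10000)$ and $U_6 = (11110)$ from the array (2); the helper `support` is a hypothetical name for the map from a sequence to its subset of $\Omega$ (positions numbered from 1):

```python
def add(a, b):
    """Sum (4): coordinatewise addition modulo 2."""
    return tuple((x + y) % 2 for x, y in zip(a, b))


def mul(a, b):
    """Product (5): coordinatewise multiplication."""
    return tuple(x * y for x, y in zip(a, b))


def support(seq):
    """Subset of Omega (1-based positions) occupied by unity in the sequence."""
    return {i + 1 for i, bit in enumerate(seq) if bit}


L1 = (1, 0, 0, 0, 0)
U6 = (1, 1, 1, 1, 0)

if __name__ == "__main__":
    a, b = support(L1), support(U6)
    print(support(mul(L1, U6)), a & b)            # (23): intersection
    print(support(add(L1, U6)), (a | b) - (a & b))  # (24): union minus intersection
```

Here `support(add(L1, U6))` is the set of positions 2, 3, 4, matching $L_1 + U_6 = (01110)$ in the text.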


4. Geometric conditions in a group alphabet. As shown in the preceding
section, the construction of a binary signaling $(n, k)$-alphabet is equivalent to
the selection of a set Ω of $n$ points from $PG(k-1, 2)$, the geometric point $P_i$
appearing $n_i$ times in Ω. The selection of Ω is in turn equivalent to the
distribution of a total measure $n$ over the points of $PG(k-1, 2)$, whereby the
nonnegative integral measure $n(P_i) = n_i$ is attached to the point $P_i$,

(25)    $\sum_{i=1}^{\mu} n_i = n, \qquad \mu = 2^k - 1.$


We define the n-measure $N_j$ of the $j$-th $(k-2)$-flat $U_j$ of $PG(k-1, 2)$ as

(26)    $N_j = \sum_{U_j} n_i, \qquad j = 1, 2, \ldots, \mu,$

where $\sum_{U_j}$ indicates summation over the points $P_i$ which lie on $U_j$.
Since every point of $PG(k-1, 2)$ is on $2^{k-1} - 1$ $(k-2)$-flats, summing (26)
on $j$ gives

(27)    $\sum_{j} N_j = (2^{k-1} - 1)\,n.$


Consider now any point $P_i$. Any point of the space other than $P_i$ determines
with $P_i$ a line, and there are as many $(k-2)$-flats "on" a line as there are
points on a $(k-3)$-flat [by duality], namely, $2^{k-2} - 1$. Hence each point
of the space other than $P_i$ lies on $2^{k-2} - 1$ of the $(k-2)$-flats passing
through $P_i$. Thus, if we sum (26) over the $(k-2)$-flats containing $P_i$, we obtain

(28)    $\sum_{P_i} N_j = (2^{k-1} - 1)\,n_i + (2^{k-2} - 1)(n - n_i)
                        = (2^{k-2} - 1)\,n + 2^{k-2} n_i, \qquad i = 1, 2, \ldots, \mu,$

where $\sum_{P_i}$ indicates summation over those $j$ indexing the $(k-2)$-flats
which pass through $P_i$.
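
The counting identities (27) and (28) can be checked numerically. The sketch below assumes a particular coordinate model that is not part of the original text: points of $PG(k-1, 2)$ are the nonzero $k$-bit integers, and the $(k-2)$-flat indexed by a nonzero integer `a` is the set of points `x` with even overlap `a & x`. The function name and the random n-measure are hypothetical.

```python
import random

def parity(x):
    """Parity of the number of 1-bits of x (the GF(2) dot product when applied to a & x)."""
    return bin(x).count("1") % 2

def check_flat_sums(k, seed=0):
    """Return (ok27, ok28): whether (27) and (28) hold for a random n-measure."""
    rng = random.Random(seed)
    points = range(1, 2**k)                       # points P_i of PG(k-1, 2)
    n_meas = {p: rng.randint(0, 3) for p in points}
    n = sum(n_meas.values())
    # N[a] is the n-measure of the (k-2)-flat {x : a.x = 0}.
    N = {a: sum(n_meas[x] for x in points if parity(a & x) == 0) for a in points}
    ok27 = sum(N.values()) == (2**(k - 1) - 1) * n
    ok28 = all(
        sum(N[a] for a in points if parity(a & p) == 0)
        == (2**(k - 2) - 1) * n + 2**(k - 2) * n_meas[p]
        for p in points
    )
    return ok27, ok28
```

For example, `check_flat_sums(3)` and `check_flat_sums(4)` both evaluate the two identities on a randomly chosen measure.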

We shall call the space of the points $P_i$ and $(k-2)$-flats $U_j$ the primary
space. Corresponding to this primary space is the dual space, in which the point
$T_j$ corresponds to the $(k-2)$-flat $U_j$ of the primary space, and the
$(k-2)$-flat $\pi_i$ corresponds to the point $P_i$ of the primary space. Each
space is a projective space $PG(k-1, 2)$. In the dual space we define a w-measure
which assigns to the point $T_j$ the integer $w_j$, where

(29)    $w_j = n - N_j, \qquad j = 1, 2, \ldots, \mu.$


By (19), the weight $w(U_j)$ of the letter $U_j$ is
$n_{i_1} + n_{i_2} + \cdots + n_{i_m}$ $(m = 2^{k-1})$,
the sum of the n-measures of all points $P_i$ lying outside $U_j$. Hence,

(30)    $w(U_j) = n - N_j = w_j, \qquad j = 1, 2, \ldots, \mu.$


That is, the w-measure of a point $T$ in the dual space is the weight $w(U)$ of the
letter whose associated flat $U$ in the primary space is the dual of $T$. If we sum
(30) on $j$, applying (27), we obtain

(31)    $\sum_{j=1}^{\mu} w(U_j) = \sum_{j=1}^{\mu} w_j
          = (2^k - 1)\,n - (2^{k-1} - 1)\,n = 2^{k-1} n.$

This is Slepian's Proposition 6 [14].
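
Relations (30) and (31) can be seen directly on a small alphabet. The sketch assumes the usual coordinate model (not spelled out in this section): points are nonzero $k$-bit integers, the alphabet has one position for each repetition of a point, and the letter indexed by `a` has unity exactly in the positions whose point `x` satisfies `a.x = 1` over GF(2). The n-measure below is an arbitrary illustration.

```python
def parity(x):
    """Parity of the number of 1-bits of x."""
    return bin(x).count("1") % 2

def letter_weights(k, n_meas):
    """Weight of each nonnull letter of the alphabet defined by the n-measure."""
    # One column (position) per repetition of each point.
    columns = [p for p, mult in sorted(n_meas.items()) for _ in range(mult)]
    return {a: sum(parity(a & x) for x in columns) for a in range(1, 2**k)}

k = 3
n_meas = {1: 1, 2: 2, 3: 0, 4: 1, 5: 0, 6: 0, 7: 1}   # hypothetical measure, n = 5
n = sum(n_meas.values())
weights = letter_weights(k, n_meas)
# N[a]: n-measure of the flat {x : a.x = 0} associated with letter a.
N = {a: sum(m for x, m in n_meas.items() if parity(a & x) == 0) for a in weights}
total_weight = sum(weights.values())
```

Each `weights[a]` equals `n - N[a]`, as in (30), and `total_weight` equals $2^{k-1} n$, as in (31).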
Since the distribution of the measure $n$ assigns nonnegative integers to the
$\mu = 2^k - 1$ points of the primary space, it is convenient to express $n$ as
$\mu t + \gamma$, and set $n_i = t + \delta_i$ $(\delta_i \ge -t)$. Taking into
account that $n \ge k$, we have

(32)    $n = (2^k - 1)\,t + \gamma,$

where $t$ is a positive integer or zero, and

        $\gamma = \sum_{i=1}^{\mu} \delta_i =
          \begin{cases} -1, 0, 1, \ldots, 2^k - 3 & \text{for } t > 0, \\
                        k, k+1, \ldots, 2^k - 3 & \text{for } t = 0. \end{cases}$

This representation has the advantage that, for given $k$, it reduces the problem
of constructing $(n, k)$-alphabets for all $n$ to the problem of constructing
$2^k - 1$ γ-classes of $(n, k)$-alphabets.
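
The decomposition (32) is easy to compute. The following sketch (the function name `gamma_class` is hypothetical) uses the fact that $\gamma + 1$ ranges over $0, 1, \ldots, 2^k - 2$, so that $t$ and $\gamma + 1$ are the quotient and remainder of $n + 1$ on division by $2^k - 1$.

```python
def gamma_class(n, k):
    """Return (t, gamma) with n = (2**k - 1) * t + gamma and -1 <= gamma <= 2**k - 3."""
    if n < k:
        raise ValueError("an (n, k)-alphabet requires n >= k")
    mu = 2**k - 1
    t, r = divmod(n + 1, mu)      # n + 1 = mu * t + (gamma + 1)
    return t, r - 1
```

For instance, with $k = 3$ the value $n = 5$ gives $t = 0$, $\gamma = 5$, while $n = 6$ gives $t = 1$, $\gamma = -1$.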

Since to each letter $U$ there correspond $2^{k-1}$ points of the primary space
[cf. (19), (20)], the weight $w(U_j)$ of $U_j$ is of the form

(33)    $w(U_j) = 2^{k-1} t + b_j,$

where $b_j$ is the sum of the $\delta_i$'s over the points corresponding to $U_j$.

Now, an essential feature of the code associated with an $(n, k)$-alphabet is
the following. When the letter $U$ is transmitted, the detector will correctly
report $U$ if and only if the errors in transmission occur in precisely those
positions occupied by unity in a coset leader $L$. Hence, if all possible n-place
sequences containing $s$ unities serve as coset leaders, then the code will correct
all s-tuple errors. If the number of weight-s sequences occurring as coset leaders
is less than $\binom{n}{s}$, say $\alpha$, then the code corrects $\alpha$ s-tuple
errors. The advantage of maximizing the number of lowest-weight sequences serving
as coset leaders, discussed in Section 1 with reference to maximizing the
probability of correct detection, now appears again, this time with reference to
maximizing the number $W$ such that all W-tuple (and lower order multiple) errors
are corrected by the code.

As shown in Section 3, every n-place sequence, whether letter $U$, coset leader
$L$, or interior coset member $L + U$, is identified by certain Ω-points associated
with the sequence. Since addition is modulo 2, the binary sequence $L + U$
contains a zero in each position identified with an Ω-point which is an associated
point of both $L$ and $U$. If $w(L) = g$, there are $g$ Ω-points associated with $L$.
The class of Ω-subsets representing all possible sequences of weight $g$ embraces
all possible combinations of $g$ points out of the $n$ Ω-points, including any set of
$g$ Ω-points associated with a letter $U$. Now, all weight-g sequences can be coset
leaders if and only if $w(L + U) > w(L)$ for all $U$'s and all weight-g $L$'s. Hence,
in order that all weight-g sequences can serve as coset leaders, it is necessary
and sufficient that $g < \tfrac{1}{2} w(U_j)$ for all $j$. Define

(34)    $W$ = the largest integer such that all sequences of weight
        $\le W$ can serve as coset leaders.


Then $W$ is the largest integer such that there exists an $(n, k)$-alphabet in which
$w(U_j) > 2W$ for all $j$, whence the well-known condition $w(U_j) \ge 2W + 1$
for all $j$. Considering the form (33) of $w(U_j)$, we set

(35)    $W = 2^{k-2} t + e,$

where $e$ is the largest integer such that there exists an $(n, k)$-alphabet in which

(36)    $w(U_j) \ge 2W + 1 = 2^{k-1} t + 2e + 1 \quad \text{for all } j.$


For given $k$, $\gamma$, and $e$, we confine our attention to the construction of only
those $(n, k)$-alphabets which satisfy (36), that is, $(n, k)$-alphabets which
provide W-error-correcting codes. Such codes have been termed
largest-nearest-neighbor-distance group codes. There has been no demonstration that
this class includes that code which has the smallest probability of incorrect
decoding for all values of $p < 1/2$, but the class does include such a code for
sufficiently small values of $p$, and the class has the desirable feature of
maximizing the multiplicity of error which will be completely corrected. For such
an alphabet, the w-measure in the dual space must, in view of (30), satisfy

(37)    $w_j \ge 2W + 1 \quad \text{for all } j.$

Set

(38)    $w_j = 2W + 1 + d_j,$


where $d_j$ is a positive integer or zero. We now define as D-measure a measure which
assigns nonnegative integers $d_j$ to the points $T_j$ of the dual space. (It will be
unambiguous to refer to an individual $d_j$ as the D-measure of the point $T_j$,
and to the sum of the $d_j$'s for all points on a c-flat $\sigma_c$ as the D-measure
of $\sigma_c$.) Since (37) is sufficient as well as necessary for an
$(n, k)$-alphabet to give a W-error-correcting code, provided the w-measure is
otherwise consistent with an $(n, k)$-alphabet, it follows that any D-measure which
is consistent with an $(n, k)$-alphabet will provide a sufficient condition for the
alphabet to give a W-error-correcting code. The conditions on the D-measure are
readily identified. First, if we sum (38) on $j$, applying (31), (32), and (35),
we have

        $2^{k-1} n = (2^k - 1)(2W + 1) + \sum_j d_j,$

        $2^{k-1}[(2^k - 1)\,t + \gamma] = (2^k - 1)(2^{k-1} t + 2e + 1) + \sum_j d_j,$

(39)    $\sum_j d_j = 2^{k-1}\gamma - (2^k - 1)(2e + 1).$


Next, if we sum (29) over those $j$ which index the points $T_j$ lying on that
$(k-2)$-flat $\pi_i$ which is the dual of the point $P_i$ of the primary space, we
obtain

        $\sum_{\pi_i} w_j = (2^{k-1} - 1)\,n - \sum_{P_i} N_j,$

where $\sum_{\pi_i}$ indicates summation over those $j$ indexing the elements which
are "on" $\pi_i$. Then by (38), (28), (35), and (32) we have

        $(2^{k-1} - 1)(2W + 1) + \sum_{\pi_i} d_j
            = (2^{k-1} - 1)\,n - (2^{k-2} - 1)\,n - 2^{k-2} n_i,$

        $(2^{k-1} - 1)(2^{k-1} t + 2e + 1) + \sum_{\pi_i} d_j
            = 2^{k-2}[(2^k - 1)\,t + \gamma] - 2^{k-2} n_i,$

(40)    $n_i = t + \gamma - 2(2e + 1)
            - \frac{1}{2^{k-2}} \Bigl[\sum_{\pi_i} d_j - (2e + 1)\Bigr].$

Since $n_i$, $t$, $\gamma$, and $e$ are integral for all $i$, (40) shows that

(41)    $\sum_{\pi_i} d_j \equiv 2e + 1 \pmod{2^{k-2}} \quad \text{for all } i.$

For defining uniquely an n-measure over the points $P_i$ of the primary space,
the equalities (39) and (40) are clearly sufficient as well as necessary, provided
that all the $n_i$ given by (40) are integral and nonnegative. We have thus
established the following theorem.

THEOREM 4. Given any $k$, $\gamma$, and $e$, where $\gamma$ and $e$ are functions of $n$
in accordance with (32) and (35), respectively, the necessary and sufficient
conditions that a D-measure uniquely define a γ-class of n-measures over the points
$P_i$ of $PG(k-1, 2)$, and thence define, uniquely to within ordering of Ω-points,
a γ-class of $(n, k)$-alphabets which give W-error-correcting codes (where $W$ is the
largest integer for which an $(n, k)$-alphabet exists) are

(1)    $\sum_{j=1}^{\mu} d_j = 2^{k-1}\gamma - (2^k - 1)(2e + 1),$

(2)    $\sum_{\pi_i} d_j \equiv 2e + 1 \pmod{2^{k-2}} \quad \text{for all } i,$

(3)    $n_i \ge 0 \quad \text{for all } i,$

where $n_i$ is given by (40).
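
As an illustration of how a D-measure and an n-measure determine each other, the sketch below starts from an n-measure for $k = 3$, $t = 1$, $\gamma = 3$, $e = 0$ (so $n = 10$, $W = 2$), forms $d_j = w_j - (2W + 1)$ as in (38), and then recovers the n-measure through (40). The coordinate model (points and flats indexed by nonzero 3-bit integers) and all function names are assumptions made for this sketch.

```python
def parity(x):
    """Parity of the number of 1-bits of x."""
    return bin(x).count("1") % 2

def d_measure(k, n_meas, W):
    """D-measure d_j = w_j - (2W + 1), with w_j the n-measure outside flat j."""
    pts = range(1, 2**k)
    w = {a: sum(n_meas[x] for x in pts if parity(a & x)) for a in pts}
    return {a: w[a] - (2 * W + 1) for a in pts}

def n_from_d(k, d, t, gamma, e):
    """Recover the n-measure from a D-measure by formula (40)."""
    pts = range(1, 2**k)
    out = {}
    for x in pts:
        s = sum(d[a] for a in pts if parity(a & x) == 0)   # sum over the flat pi_x
        q, r = divmod(s - (2 * e + 1), 2**(k - 2))
        assert r == 0, "congruence (41) violated"
        out[x] = t + gamma - 2 * (2 * e + 1) - q
    return out

k, t, gamma, e = 3, 1, 3, 0
W = 2**(k - 2) * t + e
# t + 1 on three noncollinear points (1, 2, 4), t on the remaining four points.
n_meas = {1: 2, 2: 2, 4: 2, 3: 1, 5: 1, 6: 1, 7: 1}
d = d_measure(k, n_meas, W)
recovered = n_from_d(k, d, t, gamma, e)
```

In this instance the $d_j$ are nonnegative, their total agrees with condition (1), and `recovered` coincides with `n_meas`.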

If Theorem 4 is restated in terms of $n$, $W$, and the w-measure, it is the
equivalent of Slepian's statement at the end of Section 2.9 of [14]. Similarly, such
restatement of Theorem 6, below, taken in conjunction with Corollary 5.1, is
the equivalent of Slepian's Proposition 7. In addition to giving unity to the
geometric approach and providing tools for later geometric work, the present
propositions, for any fixed $k$, organize matters relating to an infinite number of
n-values into just $2^k - 1$ γ-classes.

The congruence condition in Theorem 4 is a special case of a more general
property given by the following theorem.

THEOREM 5. Any D-measure satisfying Theorem 4 has the property

(42)    $\sum_{\sigma_c} d_j \equiv 2e + 1 \pmod{2^c}$

for all c-flats $\sigma_c$ in the dual space $PG(k-1, 2)$, $c = 1, 2, \ldots, k-1$.

PROOF. For $c = k - 1$, the congruence follows at once from condition (1)
in Theorem 4. Now consider any c-flat $\sigma_c$ in the dual space,
$c = 1, 2, \ldots, k-2$. Summing (38) over the points which lie on $\sigma_c$, we have

        $\sum_{\sigma_c} w_j = (2^{c+1} - 1)(2W + 1) + \sum_{\sigma_c} d_j
            = (2^{c+1} - 1)(2^{k-1} t + 2e + 1) + \sum_{\sigma_c} d_j$

by virtue of (35). Thus, since $c \le k - 2$,

(43)    $\sum_{\sigma_c} d_j \equiv (2e + 1) + \sum_{\sigma_c} w_j \pmod{2^c}.$

Now, if we designate by $S_{k-2-c}$ the flat, of dimension $k - 2 - c$, which in the
primary space is the dual of $\sigma_c$, we have from (29)

(44)    $\sum_{\sigma_c} w_j = (2^{c+1} - 1)\,n - \sum_{S_{k-2-c}} N_j.$

The number of $(k-2)$-flats which are "on" (pass through) $S_{k-2-c}$ is by duality
the same as the number of points on a c-flat, namely $2^{c+1} - 1$. Each of these
$(k-2)$-flats contains all the points of $S_{k-2-c}$. Further, any point outside
$S_{k-2-c}$ determines with $S_{k-2-c}$ a $(k-1-c)$-flat, through which pass
$2^c - 1$ $(k-2)$-flats; that is, every point of $PG(k-1, 2)$ which is not on
$S_{k-2-c}$ is on $2^c - 1$ of the $(k-2)$-flats which pass through $S_{k-2-c}$. Hence

        $\sum_{S_{k-2-c}} N_j = (2^{c+1} - 1) \sum_{S_{k-2-c}} n_i
            + (2^c - 1)\Bigl(n - \sum_{S_{k-2-c}} n_i\Bigr)
            = (2^c - 1)\,n + 2^c \sum_{S_{k-2-c}} n_i,$

giving in (44)

        $\sum_{\sigma_c} w_j = 2^c \Bigl[n - \sum_{S_{k-2-c}} n_i\Bigr],$

whence, since $n$ and all $n_i$'s are integral,

        $\sum_{\sigma_c} w_j \equiv 0 \pmod{2^c}.$

This result applied to (43) gives the congruence stated in the theorem.

The congruence condition in Theorem 4 is the special case $c = k - 2$. Another
special case of particular importance is given in the following corollary, taking
$c = 1$.

COROLLARY 5.1. For any D-measure satisfying Theorem 4, the D-measure of
every line in the dual space is odd.

The means of satisfying Corollary 5.1 are given by the following theorem.

THEOREM 6. A necessary and sufficient condition that a D-measure assigning
nonnegative integers $d_j$ to the points $T_j$ of $PG(k-1, 2)$, $k \ge 2$, shall be
such that the D-measure of every line is odd is that either every point of the space
has odd D-measure, or every point of one $(k-2)$-flat $\pi^*$ has odd D-measure while
all points outside $\pi^*$ have even D-measure.

PROOF. I. Sufficiency. Suppose there is associated with the $j$-th point of
$PG(k-1, 2)$ the measure $d_j$, $j = 1, 2, \ldots, \mu$, such that $d_j$ is a
positive integer or zero. If all the $d_j$ are odd, then clearly the sum of the
measures of the three points on any line is odd. If the $d_j$'s associated with the
points of a specified $(k-2)$-flat $\pi^*$ are odd, while all the remaining $d_j$'s
are even, then the situation is as follows. All the lines lying wholly within
$\pi^*$ are clearly of odd total point measure. Any line not lying wholly within
$\pi^*$ contains one point of $\pi^*$ and two points outside $\pi^*$; since both the
latter points are of even measure, the total point measure of the line is odd.

II. Necessity. Suppose that the measures $d_j$ have been assigned to the $\mu$ points
of $PG(k-1, 2)$ so that each point measure $d_j$ is a positive integer or zero, and
the sum of the measures of the three points on any line is odd. Consider first
the case $k = 2$. We are then dealing with the projective line $PG(1, 2)$, in which
$(k-2)$-flats are points. There is just one line in the space, and that line by
hypothesis has odd D-measure. Obviously, either $d_1$, $d_2$, and $d_3$ are all odd,
or one of these $d_j$'s is odd and the remaining two are even. Hence, the conclusion
stated in the theorem holds when $k = 2$.

Let us now assume that the stated conclusion follows from the hypothesis when
$k = u$, where $u$ is any integer equal to or greater than 2. That is, given that
every line in $PG(u-1, 2)$ has odd D-measure, either all the points of
$(u-1)$-space have odd D-measure or the points of a specified $(u-2)$-flat have odd
D-measure while all the remaining points of the space have even D-measure.
For ease of reference, we shall call a point odd or even according as its D-measure
is odd or even.

Consider now a projective u-space $PG(u, 2)$ satisfying the hypothesis. Then
by our assumption concerning the nature of the measure in $(u-1)$-space,
every $(u-1)$-flat in $PG(u, 2)$ is of one of two kinds:

first kind: all points are odd,

second kind: all the points of one $(u-2)$-flat are odd, and all the remaining
points of the $(u-1)$-flat are even.

Hence in any $(u-1)$-flat of the u-space there is at least one $(u-2)$-flat
containing only odd points. Take such a $(u-2)$-flat, say $\Sigma$, and consider the
three $(u-1)$-flats passing through it, say $\psi_1$, $\psi_2$, $\psi_3$, keeping in
mind that these three $(u-1)$-flats exhaust the u-space of points. Take a line $m$
which does not intersect $\Sigma$. That such choice is possible is seen from the
following lemma.

LEMMA 6.1. In $PG(u, 2)$ there are $2^{2u-2}$ lines which do not intersect an
arbitrary $(u-2)$-flat $\Sigma$.

PROOF OF LEMMA. There are $2^{u-1} - 1$ points in $\Sigma$. Through any one of these
points there pass $2^u - 1$ lines of $PG(u, 2)$. The number of these which lie
entirely in $\Sigma$ is the number of lines passing through a point in
$(u-2)$-space, namely $2^{u-2} - 1$. Also, the total number of lines in $PG(a, 2)$
is [1],

        $(2^{a+1} - 1)(2^a - 1)/(2^2 - 1)(2 - 1) = \tfrac{1}{3}(2^{a+1} - 1)(2^a - 1).$

Hence, the number of lines intersecting $\Sigma$, including those which lie wholly
within $\Sigma$, is

        $(2^{u-1} - 1)[(2^u - 1) - (2^{u-2} - 1)] + \tfrac{1}{3}(2^{u-1} - 1)(2^{u-2} - 1)
            = \tfrac{1}{3}(2^{u-1} - 1)(2^{u+1} + 2^{u-1} - 1).$

Thus, since the total number of lines in $PG(u, 2)$ is
$\tfrac{1}{3}(2^{u+1} - 1)(2^u - 1)$, the number of lines which do not intersect
$\Sigma$ is

        $\tfrac{1}{3}(2^{u+1} - 1)(2^u - 1)
            - \tfrac{1}{3}(2^{u-1} - 1)(2^{u+1} + 2^{u-1} - 1) = 2^{2u-2}.$

This establishes the lemma.
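
The count in Lemma 6.1 can be confirmed by enumeration for small $u$. The sketch assumes a coordinate model: points of $PG(u, 2)$ are the nonzero $(u+1)$-bit integers, a line is a triple $\{x, y, x \oplus y\}$, and the $(u-2)$-flat is taken to be the set of nonzero integers below $2^{u-1}$ (a subspace of vector dimension $u - 1$). The function name is hypothetical.

```python
def skew_line_count(u):
    """Number of lines of PG(u, 2) having no point in a fixed (u-2)-flat."""
    pts = range(1, 2**(u + 1))
    flat = set(range(1, 2**(u - 1)))          # points whose two top bits are zero
    lines = {frozenset((x, y, x ^ y)) for x in pts for y in pts if x < y}
    return sum(1 for line in lines if not (line & flat))
```

For $u = 2$ this counts the lines of the plane that miss a given point, and for $u = 3$ the lines of three-space skew to a given line.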

The line $m$ will intersect each of $\psi_1$, $\psi_2$, $\psi_3$ in a point. Say
these points are $P_1$, $P_2$, $P_3$, respectively.

(i) If $m$ is of the first kind, $P_1$, $P_2$, and $P_3$ are all odd, so that, since
all the points of $\Sigma$ are odd, each of $\psi_1$, $\psi_2$, $\psi_3$ must be of
the first kind, whence all the points of $PG(u, 2)$ are odd.

(ii) If $m$ is of the second kind (say $P_1$ is odd and $P_2$, $P_3$ even), then
$\psi_1$ is of the first kind and $\psi_2$, $\psi_3$ are of the second kind, whence
all the points of $PG(u, 2)$ lying on $\psi_1$ are odd and all the remaining points
of $PG(u, 2)$ are even.

Thus, either all the $d_j$'s are odd, or the $d_j$'s associated with the points of
one $(u-1)$-flat are odd and the remaining $d_j$'s are even.

Hence, the stated conclusion follows from the hypothesis when $k = u + 1$
provided the same is true for $k = u$. Since we determined at the outset that the
implication holds when $k = 2$, the same result for any integral $k \ge 2$ follows
at once by induction.
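
Theorem 6 concerns only the parities of the $d_j$, so for small $k$ it can be checked exhaustively. The sketch below enumerates every odd/even pattern on the points of $PG(k-1, 2)$, keeps those in which every line has odd total, and compares them with the patterns the theorem predicts. The coordinate model (points as nonzero $k$-bit integers, flats as solution sets of one linear equation over GF(2)) and the function names are assumptions of this sketch.

```python
from itertools import product

def parity(x):
    """Parity of the number of 1-bits of x."""
    return bin(x).count("1") % 2

def odd_line_patterns(k):
    """All sets of 'odd' points for which every line of PG(k-1, 2) has odd total."""
    pts = list(range(1, 2**k))
    lines = {frozenset((x, y, x ^ y)) for x in pts for y in pts if x < y}
    found = []
    for bits in product((0, 1), repeat=len(pts)):
        par = dict(zip(pts, bits))
        if all(sum(par[p] for p in line) % 2 == 1 for line in lines):
            found.append(frozenset(p for p in pts if par[p]))
    return found

def predicted_patterns(k):
    """Patterns allowed by Theorem 6: the whole space, or one (k-2)-flat."""
    pts = list(range(1, 2**k))
    flats = {frozenset(x for x in pts if parity(a & x) == 0) for a in pts}
    return flats | {frozenset(pts)}
```

For $k = 3$ there are eight admissible patterns: all seven points odd, or the three points of one of the seven lines odd.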


5. Determination of W. When $k$ is fixed, $W$, the largest integer such that
all sequences of weight $\le W$ can serve as coset leaders, is a function of $n$. We
may write $W = W_k(n)$, where the subscript $k$ indicates the size ($2^k$) of the
group alphabet. There is also the inverse function $n = W_k^{-1}(W)$.







THEOREM 7. For a given $k$, $W = W_k(n)$ is a monotonically nondecreasing function
of $n$, specifically

(45)    $W_k(n) \le W_k(n + 1) \le W_k(n) + 1,$

and $n = W_k^{-1}(W)$ is a monotonically increasing function of $W$.

PROOF. Given $W = W_k(n)$, there exists an n-measure over the points of
$PG(k-1, 2)$ such that $w(U_j) \ge 2W + 1$ for all letters $U_j$, that is, such that
the points lying outside any $(k-2)$-flat have total n-measure equal to or
greater than $2W + 1$. When $n$ is changed to $n + 1$, we may amend the original
n-measure by simply adding 1 to the n-measure of any one particular point,
say $P$. Then clearly the total measure of the points lying outside any
$(k-2)$-flat is not reduced. Indeed, such measure remains the same for every set of
points lying outside a flat which contains $P$, and increases by 1 for every set of
points lying outside a flat which does not contain $P$. Thus, the weight of every
letter $U_j$ of the alphabet is at least as large as it was under the original
measure, so that $W_k(n + 1) \ge W_k(n)$. The two-sided bound (45) states that the
jump in value of $W_k(n)$ cannot be greater than one for a unit increase in $n$. We
may establish this by considering the contrary. For that purpose, assume

(46)    $W_k(n) = W$

and

(47)    $W_k(n + 1) = W + c, \qquad c \ge 2.$

Then by (47) we can distribute a total measure $n + 1$ over the points of
$PG(k-1, 2)$ in such a way that the n-measure of the set of points outside any
$(k-2)$-flat is at least $2(W + c) + 1 = 2W + 2c + 1$, where $2c + 1 \ge 5$. If now
we choose any one point $P$ having nonzero n-measure, and reduce its measure by
unity, we shall have a total measure $n$ distributed over the points of
$PG(k-1, 2)$ in such manner that the total measure of the set of points outside any
$(k-2)$-flat which contains $P$ is at least $2W + 2c + 1$ and the total measure of
the set of points outside any $(k-2)$-flat which does not contain $P$ is at least
$2W + 2c$. Hence, for all letters $U_j$ of the alphabet,
$w(U_j) \ge 2W + 2c \ge 2W + 4 > 2(W + 1) + 1$, so that $W_k(n) \ge W + 1$,
contradicting (46).

When $n = k$, the array of alphabet and cosets consists of the alphabet alone,
so that there is just the single coset leader $I = (000 \cdots 0)$, whence $W = 0$.
As $n$ increases in steps of one, $W$ either stays constant or increases by unity.
This step-function nature of $W = W_k(n)$ shows that $n$ is a monotonically
increasing function of $W$.

Corresponding to a given $W$ there is in general more than one value of $n$.
The smallest $n$ corresponding to a given $W$ is a definite function of $W$, namely,
the smallest value of $W_k^{-1}(W)$. We shall call this value $n_k(W)$. Then $n_k(W)$
is a single-valued monotonically increasing function of $W$. Theorem 7 shows
that, for fixed $k$, the problem of finding $W$ for given $n$ is completely equivalent
to the problem of finding $n_k(W)$ for given $W$, that is, the minimum value of $n$
for which $W_k(n) = W$. Further, by considering $n$ and $W$ in the forms (32)
and (35), respectively, one can treat the matter in terms of γ-classes.

Taking $W$ in the form (35), $W = 2^{k-2} t + e$, let us consider the case $e = -1$
for the general value (32) of $n$: $n = (2^k - 1)\,t + \gamma$, where now $t > 0$
(since $W$ must be nonnegative) and $-1 \le \gamma \le 2^k - 3$. For $e = -1$,
$W = 2^{k-2} t - 1$, and an $(n, k)$-alphabet will allow all sequences of weight
$\le W$ to serve as coset leaders if and only if

(48)    $w(U_j) \ge 2W + 1 = 2^{k-1} t - 1 \quad \text{for all } j.$


Let us now define an n-measure over the points of the primary space as follows.

(i) If $\gamma = -1$,

(49)    $n_i = \begin{cases} t & \text{for all } i \text{ except one, say } i_0, \\
                              t - 1 & \text{for } i = i_0. \end{cases}$

(ii) If $0 \le \gamma \le 2^k - 3$,

(50)    $n_i = \begin{cases} t + 1 & \text{for } \gamma \text{ distinct points }
                                P_{i_1}, P_{i_2}, \ldots, P_{i_\gamma}, \\
                              t & \text{for each of the remaining points of the space.}
                \end{cases}$

Since by (20) a letter $U$ is identified with $2^{k-1}$ points of the primary space,
the n-measure (i) gives $w(U_j) \ge (t - 1) + (2^{k-1} - 1)\,t = 2^{k-1} t - 1$ for
all $j$, and the n-measure (ii) gives $w(U_j) \ge 2^{k-1} t$ for all $j$, thus
satisfying (48) in each instance. Hence, for any value of $n$, $k$, the value $(-1)$
can be attained for $e$. That is,

(51)    $e \ge -1 \quad \text{for all } n, k.$


Now whenever (36) holds, then necessarily

        $\sum_{j=1}^{\mu} w(U_j) \ge (2^k - 1)(2^{k-1} t + 2e + 1),$

whence, by (31) and (32), and taking (51) into account, we have

(52)    $-1 \le e \le \Bigl[\frac{2^{k-1}\gamma - (2^k - 1)}{2(2^k - 1)}\Bigr],$

where $[z]$ has its usual meaning "greatest integer not exceeding $z$." In terms of
$n$, the upper bound in (52) is already well known (cf., for example, Weinitschke
[15] and MacDonald [12]); the refinement given by (59) below appears to be new.

Since $\gamma < 2^k - 2$, the quantity within brackets in (52) is bounded above by
$2^{k-2} - 1 - 1/(2^{k+1} - 2)$, whence the most general boundary statement for
$e$ is

(53)    $-1 \le e \le 2^{k-2} - 2.$







Applying (53) to (35), we have $W + 1 = 2^{k-2} t + (e + 1)$,
$0 \le (e + 1) \le 2^{k-2} - 1$, so that

        $t = \Bigl[\frac{W + 1}{2^{k-2}}\Bigr]$

is a well-defined single-valued function of $W$. Similarly, the bounds on $\gamma$
in (32) show that

        $t = \Bigl[\frac{n + 1}{2^k - 1}\Bigr]$

is a well-defined single-valued function of $n$. Moreover, if $W_k(n) = W$,

(54)    $\Bigl[\frac{n + 1}{2^k - 1}\Bigr] = t = \Bigl[\frac{W + 1}{2^{k-2}}\Bigr].$
Thus, for fixed $k$, the problem of finding $W$ for given $n$ has the equivalent forms:
(i) given $n$, to find $W = W_k(n)$;
(ii) given $\gamma$, to find $e = W_k((2^k - 1)\,t + \gamma) - 2^{k-2} t = e_k(\gamma)$, say;
(iii) given $W$, to find $n = n_k(W)$ = smallest value of $n$ for which $W_k(n) = W$;
(iv) given $e$, to find $\gamma = \gamma_k(e)$ = smallest value of $\gamma$ for which $e_k(\gamma) = e$.
One general result for all $k$ follows immediately from the demonstration
relating to the n-measure (49):

(55)    $\gamma_k(-1) = -1 \quad \text{for all } k.$

Further investigations concerning $e_k(\gamma)$ or $\gamma_k(e)$ need thus deal only
with nonnegative values of $\gamma$ and $e$.

An immediate result of (55) is the complete specification of the class
$\gamma = -1$ of $(n, k)$-alphabets which give W-error-correcting codes, where
$W = 2^{k-2} t - 1$. For $\gamma = -1$, $e = -1$, Theorem 4 requires

        $\sum_j d_j = -2^{k-1} - (2^k - 1)(-1) = 2^{k-1} - 1,$

while Corollary 5.1 and Theorem 6 require that either all the $d_j$'s be odd or the
$d_j$'s associated with the points of one $(k-2)$-flat $\pi_{i_0}$ be odd and all
other $d_j$'s be even, so that there is only one possible D-measure: the D-measure
which assigns $d_j = 1$ to each of the $2^{k-1} - 1$ points of $\pi_{i_0}$ and
$d_j = 0$ to each point outside $\pi_{i_0}$. A unique n-measure follows from this
D-measure by application of (40), which here is

        $n_i = t + 1 - \frac{1}{2^{k-2}} \Bigl[\sum_{\pi_i} d_j + 1\Bigr].$

Since any $(k-2)$-flat other than $\pi_{i_0}$ meets $\pi_{i_0}$ in a $(k-3)$-flat,
containing $2^{k-2} - 1$ points,

        $\sum_{\pi_i} d_j = 2^{k-2} - 1, \qquad i \ne i_0,$

        $\sum_{\pi_{i_0}} d_j = 2^{k-1} - 1,$

whence

(56)    $n_i = \begin{cases} t & \text{for } i \ne i_0, \\
                              t - 1 & \text{for } i = i_0 \end{cases}
          \qquad (\gamma = -1, \text{ any } k).$

This is precisely (49), which is thus seen to be the unique design for
W-error-correcting $(n, k)$-alphabets of the class $\gamma = -1$. This is the
Type t,1-alphabet of MacDonald [12].


For nonnegative $e$ and $\gamma$, the functions $\gamma_k(e)$ and $e_k(\gamma)$ depend
heavily on $k$. The case $k = 2$ is readily resolved. Here $PG(k-1, 2)$ is the
projective line $PG(1, 2)$, $(k-2)$-flats are points, $n = 3t + \gamma$,
$W = t + e$, and $-1 \le \gamma \le 2^2 - 3$, that is, $\gamma = -1, 0, 1$. The
bounds given by (52) are

        $-1 \le e \le \Bigl[\frac{2\gamma - 3}{6}\Bigr],$

so that $e = -1$ for all $\gamma$. The n-measure admitting this value of $e$ for
$\gamma = -1$ is given by (56): $(t, t, t - 1)$; obvious n-measures admitting
$e = -1$ (that is, $W = t - 1$) for $\gamma = 0, 1$ are $(t, t, t)$ and
$(t, t, t + 1)$, respectively. (The weight $w(U_j)$ of the letter $U_j$ is the total
n-measure of all points lying outside the $U_j$-associated flat (here, point), so
that in both the latter cases

        $w(U_j) \ge 2W + 1 = 2t - 1$

for all $j$.) The relation of $W$ and $n$ in the case $k = 2$ may thus be summarized:

(57)    $k = 2: \quad n = 3t + \gamma, \quad t > 0, \ \gamma = -1, 0, 1;$
        $\qquad\qquad W = t + e, \quad e = -1 \text{ for all } \gamma.$


The upper bound on $e$ given by (52) is a necessary, but unfortunately not a
sufficient, condition for the existence of an $(n, k)$-alphabet admitting $e$ for
given $k$ and $\gamma$. One might expect that if the bound were refined by taking into
account all the conditions on an $(n, k)$-alphabet, the bound could be attained.
A first refinement of the bound results from application of Corollary 5.1 and
Theorem 6 to (39). We have by (39)

(58)    $e = \frac{2^{k-1}\gamma - (2^k - 1) - \sum_j d_j}{2(2^k - 1)},$

and by Corollary 5.1 and Theorem 6 there must be at least one $(k-2)$-flat
in the dual space in which all points have odd D-measure, so that

        $\sum_j d_j \ge (2^{k-1} - 1)(1),$

whence

(59)    $-1 \le e \le \Bigl[\frac{2^{k-2}(\gamma - 3) + 1}{2^k - 1}\Bigr].$
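
The two upper bounds on $e$ are simple to tabulate. The sketch below writes (52) as $[(2^{k-1}\gamma - (2^k - 1))/(2(2^k - 1))]$ and (59) as $[(2^{k-2}(\gamma - 3) + 1)/(2^k - 1)]$, both obtained from (39) as in the text; the function names are hypothetical. Python's floor division matches the bracket "greatest integer not exceeding", including for negative arguments.

```python
def bound_52(k, gamma):
    """Upper bound on e from (52)."""
    mu = 2**k - 1
    return (2**(k - 1) * gamma - mu) // (2 * mu)

def bound_59(k, gamma):
    """Refined upper bound on e from (59); requires k >= 2."""
    return (2**(k - 2) * (gamma - 3) + 1) // (2**k - 1)

# Bounds for k = 3 over the admissible range of gamma.
table_k3 = {g: (bound_52(3, g), bound_59(3, g)) for g in range(-1, 6)}
```

For $k = 3$ and $\gamma = 2$ the bound (52) would still allow $e = 0$, whereas (59) gives $-1$; for $k = 4$ and $\gamma = 3$ even (59) allows $e = 0$, which is why further congruence conditions are needed when $k > 3$.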







Consider the case $k = 3$. Here $PG(k-1, 2)$ is the projective plane $PG(2, 2)$,
$(k-2)$-flats are lines, $n = 7t + \gamma$, $W = 2t + e$. Let us fix attention on
determining $\gamma_3(e)$. From (53) we have $-1 \le e \le 0$, so that the only
possible values of $e$ are $(-1)$ and 0. We know from (55) that
$\gamma_3(-1) = -1$, so that we need find only $\gamma_3(0)$. For $k = 3$, $e = 0$,
(59) gives

        $0 \le \Bigl[\frac{2(\gamma - 3) + 1}{7}\Bigr],$

yielding $\gamma = 3$ as the smallest value of $\gamma$ potentially attainable, that
is, $\gamma_3(0) \ge 3$.


We demonstrate that the bound is actually attainable by exhibiting an n-measure
which defines an alphabet allowing all sequences of weight

        $W = 2t + e = 2t$

to serve as coset leaders, given $n = 7t + 3$. Define the n-measure so that
$n_i = t + 1$ for three noncollinear points of $PG(2, 2)$, and $n_i = t$ for the
remaining four points of $PG(2, 2)$. Then the greatest total n-measure of any
$(k-2)$-flat (line) is $3t + 2$, so that the total n-measure of the points lying
outside any line is at least $(7t + 3) - (3t + 2) = 4t + 1$. That is,
$w(U_j) \ge 4t + 1$ for all $j$, whence obviously all sequences of weight $2t = W$
can serve as coset leaders. The relation of $W$ and $n$ in the case $k = 3$ may thus
be summarized:

(60)    $k = 3: \quad n = 7t + \gamma, \quad
          \gamma = \begin{cases} -1, 0, 1, 2, 3, 4, 5 & \text{for } t > 0, \\
                                 3, 4, 5 & \text{for } t = 0; \end{cases}$

        $\qquad\qquad W = 2t + e, \quad
          e = \begin{cases} -1 & \text{for } \gamma = -1, 0, 1, 2, \\
                            0 & \text{for } \gamma = 3, 4, 5. \end{cases}$
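
The summary (60) can be compared with an exhaustive search for small $n$. The sketch below is a brute-force check, not the authors' method: it assumes the coordinate model in which an alphabet is a multiset of $n$ nonzero 3-bit column vectors and the letter indexed by `a` has weight equal to the number of columns `x` with `a.x = 1`. It takes $W$ as the largest value of $[(w_{\min} - 1)/2]$ over all such multisets. Function names are hypothetical.

```python
from itertools import combinations_with_replacement

def parity(x):
    """Parity of the number of 1-bits of x."""
    return bin(x).count("1") % 2

def W_brute(n, k):
    """Largest W with w(U_j) >= 2W + 1 for all j, by exhaustive search (small n only)."""
    letters = range(1, 2**k)
    best = 0
    for cols in combinations_with_replacement(range(1, 2**k), n):
        w_min = min(sum(parity(a & x) for x in cols) for a in letters)
        best = max(best, w_min)
    return (best - 1) // 2

def W_formula_k3(n):
    """W for k = 3 according to the summary (60)."""
    t, r = divmod(n + 1, 7)        # n = 7t + gamma with gamma = r - 1
    gamma = r - 1
    e = -1 if gamma <= 2 else 0
    return 2 * t + e
```

Calling both functions for $n = 3, 4, \ldots, 9$ gives agreeing values; the search grows quickly with $n$ and is meant only as a small-case check.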


We observe that the upper bound in (59) came about by placing in (58) the
smallest possible value of $\sum_j d_j$ taking account of the congruence condition
(42) for $c = 1$. This is the only pertinent value of $c$ when $k = 3$. When
$k > 3$, additional congruence conditions must be brought to bear. Consider the
case $k = 4$. Here $PG(k-1, 2)$ is the projective three-space $PG(3, 2)$,
$(k-2)$-flats are planes, $n = 15t + \gamma$, $W = 4t + e$. We set out to determine
$\gamma_4(e)$ for nonnegative $e$; by (53) these values of $e$ are 0, 1, 2. From
(58) we have

(61)    $8\gamma = 15(2e + 1) + \sum_j d_j,$

and if we designate by $\min \sum_j d_j$ the smallest value of $\sum_j d_j$
consistent with the congruence conditions on a D-measure, then a lower bound on
$\gamma$ for given $e$ is provided by

(62)    $8\gamma \ge 15(2e + 1) + \min \sum_j d_j.$







The congruence conditions (42) are

(63.1)    $\sum_{\sigma_1} d_j \equiv 2e + 1 \pmod 2 \quad \text{for all lines } \sigma_1,$

(63.2)    $\sum_{\pi} d_j \equiv 2e + 1 \pmod 4 \quad \text{for all planes } \pi,$

(63.3)    $\sum_{j} d_j \equiv 2e + 1 \pmod 8.$

The condition (63.1) is satisfied by means of Theorem 6: either all the $d_j$ are
odd, or the $d_j$ for points of one plane $\pi^*$ are odd and all other $d_j$ are
even. The smallest provisional $\sum d_j$, say $\sum^* d_j$, is obviously given by
the second alternative, assigning $d_j = 1$ to each point of $\pi^*$ and $d_j = 0$
to each of the remaining points of the dual space. Let us call this measure the
basic measure $D^*$. For it we have $\sum^* d_j = 7$.

We shall consider the e-values in reverse order since the case $e = 0$ presents
the most complications. When $e = 2$, the congruence conditions (63.2) and
(63.3) are:

(63.2a)    $\sum_{\pi} d_j \equiv 1 \pmod 4 \quad \text{for all planes } \pi,$

(63.3a)    $\sum_{j} d_j \equiv 5 \pmod 8.$

We see that $\sum^* d_j = 7$ does not satisfy (63.3a) and that a minimum addition
of 6 must be made. When this addendum is distributed to point or points $T_j$,
the addition to any point must be a multiple of 2 in order that the congruence
(63.1) be not disturbed. Let us distribute the addition by adding 2 to the
D-measure of each point on a line $l$ not lying wholly in $\pi^*$ (meeting $\pi^*$
in $T^*$, say). Then there are four categories of planes with respect to
D-measure: $\pi^*$; the 3 planes $\pi'$ containing $l$; the 3 remaining planes
$\pi''$ meeting $\pi^*$ in a line containing $T^*$; the eight planes $\pi'''$
meeting $\pi^*$ in a line not containing $T^*$. The D-measures of these planes are
as follows.

        $\sum_{\pi^*} d_j = 6(1) + 1(3) = 9 \equiv 1 \pmod 4,$

        $\sum_{\pi'} d_j = 2(1) + 1(3) + 2(2) + 2(0) = 9 \equiv 1 \pmod 4,$

        $\sum_{\pi''} d_j = 2(1) + 1(3) + 4(0) = 5 \equiv 1 \pmod 4,$

        $\sum_{\pi'''} d_j = 3(1) + 1(2) + 3(0) = 5 \equiv 1 \pmod 4.$

Thus (63.2a) is satisfied, $\min \sum_j d_j = 7 + 6 = 13$, and (62) gives

        $8\gamma \ge 15(5) + 13 = 88, \qquad \gamma \ge 11.$


Moreover, the bound is attained by means of the D-measure specified in the
above argument, for reference to (40) shows that the resulting n-measure
satisfies

        $n_i \ge t + 11 - 2(5) - \tfrac{1}{4}\bigl[\max_i \sum_{\pi_i} d_j - 5\bigr]
             = t + 1 - \tfrac{1}{4}[9 - 5] = t \ge 0 \quad \text{for all } i.$

Thus, $\gamma_4(2) = 11$.
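
The D-measure constructed for $e = 2$ can be reproduced numerically. The sketch assumes a coordinate model for $PG(3, 2)$ in which dual points are nonzero 4-bit integers and the plane dual to the primary point `x` consists of the `a` with `a.x = 0`; the particular plane taken as $\pi^*$ and the particular line $\{1, 8, 9\}$ are arbitrary choices consistent with the description above. The n-measure is then obtained from (40) with $t = 0$.

```python
def parity(x):
    """Parity of the number of 1-bits of x."""
    return bin(x).count("1") % 2

K = 4
PTS = range(1, 2**K)

def plane(x):
    """Plane of the dual space that is dual to the primary point x."""
    return [a for a in PTS if parity(a & x) == 0]

# Basic measure D*: d = 1 on pi* (taken here as plane(8), i.e. a = 1..7), 0 elsewhere.
d = {a: (1 if a in plane(8) else 0) for a in PTS}
# Add 2 on each point of the line {1, 8, 9}, which meets pi* only in T* = 1.
for a in (1, 8, 9):
    d[a] += 2

plane_sums = {x: sum(d[a] for a in plane(x)) for x in PTS}

e, gamma, t = 2, 11, 0
# Formula (40) with k = 4: n_i = t + gamma - 2(2e+1) - (1/4)[sum - (2e+1)].
n_meas = {x: t + gamma - 2 * (2 * e + 1) - (plane_sums[x] - (2 * e + 1)) // 4
          for x in PTS}
# Letter weights of the resulting alphabet (n = 11 when t = 0).
weights = {a: sum(n_meas[x] for x in PTS if parity(a & x)) for a in PTS}
```

Every plane sum is 5 or 9, the n-measure is nonnegative with total 11, and each letter weight equals $2W + 1 + d_j = 5 + d_j$, so the minimum weight is 5.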
When $e = 1$, we must satisfy

(63.2b)    $\sum_{\pi} d_j \equiv 3 \pmod 4 \quad \text{for all planes } \pi,$

(63.3b)    $\sum_{j} d_j \equiv 3 \pmod 8.$

The basic measure $D^*$ does not satisfy (63.3b); the minimum amount which
must be added to $\sum^* d_j$ is 4. We observe that $D^*$ does satisfy (63.2b), since
$\pi^*$ has D-measure 7 and all other planes have D-measure 3. Hence, if the
required addendum 4 is assigned to a single point, the congruences (63.2b) will
not be disturbed. Keeping in mind that an alphabet requires $n_i \ge 0$ for all
$i$, and that by (40) $n_i$ is a decreasing function of $\sum_{\pi_i} d_j$, our aim
is to keep $\max_i \sum_{\pi_i} d_j$ as small as possible. Hence we assign the
additional measure 4 to a point $T_0$ lying outside $\pi^*$. Then $\pi^*$ has
D-measure 7, any other plane not containing $T_0$ has D-measure
$3(1) + 4(0) = 3$, and any plane containing $T_0$ has D-measure
$1(4) + 3(1) + 3(0) = 7$. Thus, $\min \sum_j d_j = 7 + 4 = 11$, and (62) gives
$\gamma \ge 7$. Moreover, the bound is attainable, since the D-measure constructed
above gives

        $n_i \ge t + 7 - 2(3) - \tfrac{1}{4}\bigl[\max_i \sum_{\pi_i} d_j - 3\bigr]
             = t + 1 - \tfrac{1}{4}[7 - 3] = t \ge 0 \quad \text{for all } i.$

Thus, $\gamma_4(1) = 7$.
For $e = 0$, we must satisfy

(63.2c)    $\sum_{\pi} d_j \equiv 1 \pmod 4 \quad \text{for all planes } \pi,$

(63.3c)    $\sum_{j} d_j \equiv 1 \pmod 8.$


Again there must be an addition to $\sum^* d_j$ in order to satisfy (63.3c). This
time the minimum addendum is 2; however, if that additional measure is given to a
point outside $\pi^*$, then the D-measure of $\pi^*$ is $7 \not\equiv 1 \pmod 4$,
and if the additional measure is given to a point $T^*$ of $\pi^*$, then any plane
meeting $\pi^*$ in a line not containing $T^*$ will have D-measure
$3(1) + 4(0) = 3 \not\equiv 1 \pmod 4$. Hence addendum 2 must be ruled out, and
the minimum addition to $\sum^* d_j$ must be considered to be 10. Keeping in mind
that, for providing $n_i \ge 0$ for all $i$, it is desirable that
$\max_i \sum_{\pi_i} d_j$ be as small as possible, we try to spread the measure
10 as thinly as possible over the planes of the space. Let us then increase by 2
the D-measures of 5 points such that not more than 3 are on any plane and
exactly one (say $T^*$) is on $\pi^*$. For ease of reference we shall call these 5
points heavy points. There are now three categories of planes.

(i) The plane $\pi^*$. The D-measure of this plane is
$6(1) + 1(3) = 9 \equiv 1 \pmod 4$.

(ii) Planes $\pi'$ meeting $\pi^*$ in a line containing $T^*$. There are 6 such
planes, in pairs, each pair forming with $\pi^*$ a pencil. Any pencil exhausts the
points of the space. Hence, since not more than 3 heavy points are on a single
plane, the 4 heavy points outside $\pi^*$ must be distributed two each on the
planes $\pi'$ of any pencil of the type under discussion. Hence the D-measure of
any plane $\pi'$ is $1(3) + 2(1) + 2(2) + 2(0) = 9 \equiv 1 \pmod 4$.

(iii) Planes $\pi''$ meeting $\pi^*$ in a line not containing $T^*$. There are 8
such planes, in pairs, each pair forming with $\pi^*$ a pencil. Since the 4 heavy
points outside $\pi^*$ are not on a single plane, there are $\binom{4}{3} = 4$
distinct planes containing 3 heavy points each, and these must clearly be in 4
different pencils (since otherwise there would have to be 7 heavy points in the
space). The third plane of each such pencil then contains one heavy point. Hence
the D-measure of a plane $\pi''$ is either $3(1) + 3(2) + 1(0) = 9$ or
$3(1) + 1(2) + 3(0) = 5$; both values are congruent to one modulo 4. Thus
(63.2c) is satisfied,


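        $\min \sum_j d_j = 7 + 10 = 17,$

and (62) gives $\gamma \ge 4$. The bound is attainable since the D-measure specified
above gives

        $n_i \ge t + 4 - 2(1) - \tfrac{1}{4}\bigl[\max_i \sum_{\pi_i} d_j - 1\bigr]
             = t + 2 - \tfrac{1}{4}[9 - 1] = t \ge 0 \quad \text{for all } i.$

Thus, $\gamma_4(0) = 4$. Since by (55) $\gamma_4(-1) = -1$, we may now summarize the
relation of $W$ and $n$ in the case $k = 4$:

        $n = 15t + \gamma, \quad
          \gamma = \begin{cases} -1, 0, 1, 2, 3, 4, 5, \ldots, 13 & \text{for } t > 0, \\
                                 4, 5, \ldots, 13 & \text{for } t = 0; \end{cases}$

        $W = 4t + e, \quad
          e = \begin{cases} -1 & \text{for } \gamma = -1, 0, 1, 2, 3, \\
                            0 & \text{for } \gamma = 4, 5, 6, \\
                            1 & \text{for } \gamma = 7, 8, 9, 10, \\
                            2 & \text{for } \gamma = 11, 12, 13. \end{cases}$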


Given $k$, $\gamma$, and $e$, a $\gamma$-class of $W$-error-correcting codes is obtained by setting
up a $\gamma$-class of $(n, k)$-alphabets defined by a D-measure which satisfies Theorem
4. The alphabet (22) is a member of such a class, specifically of a $\gamma$-class 5 ($k = 3$)
since $n = 5 = (2^3 - 1)(0) + 5$. For such a class, (60) gives $e = 0$, and then
Theorem 4 requires

(i) $\sum_j d_j = 4(5) - (7)(1) = 13$,

(ii) $\sum_j d_j \equiv 1 \pmod 2$ for all lines,

(iii) $n_i = t + 3 - \tfrac12\bigl[\sum_j d_j - 1\bigr] \ge 0$ for all $i$,

the last inequality demanding, for any D-measure admitting all values of $t$,

$$\sum_j d_j \le 7 \quad \text{for all } i.$$


The congruence condition is satisfied through application of Theorem 6. The
class of alphabets to which (22) belongs is given by the D-measure exhibited
in Figure 2; the point measures $d_j$ are shown within parentheses. Application
of (iii) readily verifies that $n_i = t,\ t + 2,\ t,\ t + 1,\ t + 1,\ t,\ t + 1$ for $i = 1,
2, \dots, 7$, respectively. The $n$-measure for alphabet (22) is the special case
$t = 0$.
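As a consistency check on conditions (i) to (iii), the quoted values of $n_i$ can be inverted to recover the line sums. The short sketch below assumes only the reading of (iii) as $n_i = t + 3 - \tfrac12(S_i - 1)$, where $S_i$ is the sum of the point measures on line $i$, and the fact that each point of the 7-point plane lies on exactly 3 lines; the variable names are illustrative.

```python
# n-measure of alphabet (22): the quoted values n_i at t = 0, lines i = 1..7.
n_at_t0 = [0, 2, 0, 1, 1, 0, 1]

# Invert (iii): n_i = t + 3 - (S_i - 1) / 2  gives  S_i = 2 * (t + 3 - n_i) + 1.
line_sums = [2 * (3 - n_i) + 1 for n_i in n_at_t0]

# Each of the 7 points lies on exactly 3 lines, so the line sums count
# every point measure three times.
total_d = sum(line_sums) // 3

print(line_sums)      # sums of the point measures along each line
print(total_d)        # total D-measure, to compare with condition (i)
print(sum(n_at_t0))   # word length n at t = 0
```

The recovered line sums are all odd and at most 7, the total D-measure agrees with condition (i), and the word length is 5, as stated for alphabet (22).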

As is clearly apparent in the foregoing example, there will in general be many
D-measures satisfying Theorem 4 for given $k$, $\gamma$, and $e$. It is then reasonable
to select from such a class of D-measures as "optimum" that measure (or those
measures) whose resulting code(s) will correct the maximum number of
$(W + 1)$-tuple errors. An optimum alphabet is thus one which allows the maximum
number of weight-$(W + 1)$ sequences to serve as coset leaders. The alphabet (22) is
such an optimum alphabet. The selection is based upon calculation of a quantity
A for each competing D-measure, where A is termed the discrepancy and is defined
as the number of weight-$(W + 1)$ sequences which do not serve as coset
leaders. The derivation of a formula which allows convenient calculation of A
will be presented in a subsequent paper. Also available are complete tables of
optimum designs for $k = 2, 3, 4$ (all $n$), arrived at by application of the foregoing
notions.

For increasing $k$, the establishment of $n_k(W)$ and the orderly construction of
D-measures become increasingly complicated. Thus far no general procedures
are known. Some results in these matters, based on the geometric
structure and theorems herein reported, will be presented in later communications.

REFERENCES 


[1] R. C. Bose, "On the construction of balanced incomplete block designs," Ann. Eugenics,
Vol. 9 (1939), pp. 353-399.
[2] R. C. Bose, "Mathematical theory of the symmetrical factorial design," Sankhya,
Vol. 8 (1947), pp. 107-166.
[3] R. C. Bose and R. C. Burton, "On a problem in Abelian groups and the construction
of fractionally replicated designs" (Abstract), Ann. Math. Stat., Vol. 28 (1957),
p. 533.
[4] L. Calabi and H. G. Haefeli, On Hobbs' Code, Technical Memorandum No. 14, Parke
Mathematical Laboratories, Carlisle, Massachusetts, June 1957.
[5] Peter Elias, "Error-free coding," Trans. I. R. E. Professional Group on Information
Theory, PGIT-4 (1954), pp. 29-37.
[6] Peter Elias, "Coding for noisy channels," IRE Convention Record, Vol. 3 (1955), Part
4, pp. 37-46.
[7] A. B. Fontaine and W. W. Peterson, On Coding for the Binary Symmetric Channel,
Research Report RC-43, IBM Research Center, International Business Machines
Corp., Poughkeepsie, N. Y., February 1958.
[8] E. N. Gilbert, "A comparison of signaling alphabets," Bell System Technical J.,
Vol. 31 (1952), pp. 504-522.
[9] M. J. E. Golay, "Binary coding," Trans. I. R. E. Professional Group on Information
Theory, PGIT- (1954), pp. 23-28.
[10] R. W. Hamming, "Error detecting and error correcting codes," Bell System Technical
J., Vol. 29 (1950), pp. 147-160.
[11] S. P. Lloyd, "Binary block coding," Bell System Technical J., Vol. 36 (1957), pp. 517-535.
[12] J. E. MacDonald, Jr., Constructive Coding Methods for the Binary Symmetric Independent
Data Transmission Channel, MEE Thesis, Syracuse University, January 1958.
[13] I. S. Reed, "A class of multiple-error-correcting codes and the decoding scheme,"
Trans. I. R. E. Professional Group on Information Theory, PGIT-4 (1954), pp.
38-49.
[14] David Slepian, "A class of binary signaling alphabets," Bell System Technical J.,
Vol. 35 (1956), pp. 203-234.
[15] H. Wernirscuxe, On Some Upper Bounds of Importance in Slepian's Theory of Coding,
Technical Memorandum No. 15, Parke Mathematical Laboratories, Carlisle,
Massachusetts, June 1957.





A NECESSARY AND SUFFICIENT CONDITION FOR THE 
EXISTENCE OF CONSISTENT ESTIMATES 


By Lucien LeCam¹ and Lorraine Schwartz²
University of California, Berkeley 


1. Introduction. Let $\mathcal{X}$ be an arbitrary set and let $\mathcal{A}$ be a $\sigma$-field of subsets of
$\mathcal{X}$. Let $\mathcal{P}$ be the family of all probability measures on $\mathcal{A}$. Let $\Theta$ be a topological
space which is homeomorphic to a subset of the cube $K = J^{\aleph_0}$, the product of
a countable family of copies of the interval $J = [0, 1]$.

Let $\mathcal{D}$ be a subset of $\mathcal{P}$ and let $\varphi: P \to \varphi(P)$ be a function defined on $\mathcal{D}$ and
taking its values in $\Theta$.

Let $X_1, X_2, \dots, X_n, \dots$ be a sequence of independent identically distributed
variables taking their values in $\mathcal{X}$ and distributed according to some
$P \in \mathcal{D}$. Our purpose is to give a necessary and sufficient condition for the existence
of consistent estimates of the function $\varphi(P)$.

More precisely, the problem can be described as follows. For each integer $n$
let $\mathcal{X}^n$ be the product of $n$ copies of $\mathcal{X}$, let $\mathcal{A}^n$ be the $\sigma$-field product of $n$ copies
of $\mathcal{A}$ and let $P^n$ be the measure defined on $\mathcal{A}^n$ by the product of $n$ copies of $P$.

Let $\mathcal{F}$ be an arbitrary family of subsets of $\mathcal{D}$. If $\theta$ and $\theta'$ are elements of the
cube $K$ let $\theta_i$ and $\theta'_i$ be their $i$th coordinates in $K$ and let $\rho(\theta, \theta')$ be the distance

$$\rho(\theta, \theta') = \sum_{i=1}^{\infty} \frac{1}{2^i}\,|\theta_i - \theta'_i|. \tag{1}$$

By assumption the distance $\rho$ defines on $\Theta \subset K$ its original topology.
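For concreteness, the distance on the cube can be evaluated on finitely many coordinates. The sketch below assumes the weights are $2^{-i}$ (the reading of display (1) adopted here; the scan is damaged at that point) and truncates the series; with coordinates in $[0, 1]$, dropping all coordinates beyond the $m$th changes the value by at most $2^{-m}$. The function name `rho` is illustrative.

```python
def rho(theta, theta_prime):
    """Truncated cube distance: sum over i >= 1 of 2**-i * |theta_i - theta'_i|.

    Both arguments are finite sequences of coordinates in [0, 1]; the
    coordinates that are not listed are treated as equal in both points.
    """
    if len(theta) != len(theta_prime):
        raise ValueError("both points need the same number of coordinates")
    return sum(abs(a - b) / 2 ** i
               for i, (a, b) in enumerate(zip(theta, theta_prime), start=1))


print(rho([0, 0, 0], [1, 1, 1]))        # 1/2 + 1/4 + 1/8
print(rho([0.5, 0.25], [0.5, 0.75]))    # only the second coordinate differs
```

The geometric weights keep the series convergent and make convergence in this distance equivalent to coordinatewise convergence, which is the product topology of the cube.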

Let $\mathcal{B}$ denote the $\sigma$-field of Borel subsets of $\Theta$ (or $K$). We shall say that $\varphi$ is
$\mathcal{F}$-consistently estimable if there is a sequence $\{T_n\}$ with the following properties:

(1) The function $T_n$ is a measurable map from $\{\mathcal{X}^n, \mathcal{A}^n\}$ to $\{\Theta, \mathcal{B}\}$.

(2) For every $\epsilon > 0$ and $P \in \mathcal{D}$ let $V(P, \epsilon)$ be the sphere consisting of the elements of
$\Theta$ whose distance to $\varphi(P)$ is not larger than $\epsilon$. Then for every $\epsilon > 0$ and every
$F \in \mathcal{F}$ the quantity

$$\sup_{P \in F} P^n[T_n \notin V(P, \epsilon)]$$

tends to zero as $n$ tends to infinity.

The explicit purpose of the present paper is to give a characterization of the
functions $\varphi$ which are $\mathcal{F}$-consistently estimable.
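A toy instance may help fix the definition. The sketch below is not from the paper: it assumes the two-point sample space {0, 1}, takes the family to be the Bernoulli distributions on a grid of parameters, the function to be estimated to be the success probability, and the estimate to be the sample mean. It computes the supremum in property (2) exactly and compares it with Hoeffding's bound $2\exp(-2n\epsilon^2)$, which does not depend on the parameter.

```python
from math import comb, exp


def miss_probability(n, p, eps):
    """Exact P^n[|T_n - p| > eps] when T_n is the mean of n Bernoulli(p) draws."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n + 1) if abs(k / n - p) > eps)


def sup_miss(n, eps, grid):
    """Supremum of the miss probability over a finite grid of parameters."""
    return max(miss_probability(n, p, eps) for p in grid)


grid = [i / 20 for i in range(21)]
eps = 0.1
for n in (10, 100, 400):
    # exact supremum over the grid next to Hoeffding's parameter-free bound
    print(n, sup_miss(n, eps, grid), 2 * exp(-2 * n * eps ** 2))
```

Because the bound is uniform in the parameter, the supremum tends to zero, so in this toy case the success probability is consistently estimable uniformly over the whole family.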

The terminology and results of a topological nature used in this paper can be
found in either [1] or [2]. The concept of a precompact uniform structure, necessary
to the main result of the paper, corresponds to the notion of proximity
introduced by Efremovicz [3] and may be replaced by it (see also I. S. Gál [4]).
For a comparison of this to the more usual topologies and distances used the
reader is referred to Section 3.

Received March 23, 1959; revised August 3, 1959.

¹ Fellow of the A. P. Sloan Foundation.

² This paper was prepared with the partial support of the Office of Ordnance Research,
U. S. Army under Contract DA-04-200-ORD-171, Task Order 3.


2. A characterization of $\mathcal{F}$-consistently estimable functions. On the space
$\mathcal{P}$ of probability measures on $\mathcal{A}$ define a uniform structure $\mathcal{U}_n$ by the vicinities
of the diagonal of $\mathcal{P} \times \mathcal{P}$ which are of the form

$$W = W(f_1, f_2, \dots, f_k) = \Bigl\{(P, Q) : \Bigl|\int f_j\,dP^n - \int f_j\,dQ^n\Bigr| < 1;\ j = 1, 2, \dots, k\Bigr\}, \tag{2}$$

where the $f_j$'s are $\mathcal{A}^n$-measurable bounded numerical functions defined on $\mathcal{X}^n$.
Let $\mathcal{U}$ be the uniform structure obtained by taking all vicinities of the preceding
type, for all values of $n$.


THEOREM 1: The function $\varphi$ is $\mathcal{F}$-consistently estimable if and only if there is a
sequence $\{\varphi_\nu\}$ of functions from $\mathcal{D}$ to $K$ such that
(a) Each $\varphi_\nu$ is uniformly continuous for the structure $\mathcal{U}$ on $\mathcal{D}$ and the structure
defined by $\rho$ on $K$.
(b) The sequence $\{\varphi_\nu\}$ converges to $\varphi$ uniformly on the elements of $\mathcal{F}$.


PROOF: Suppose that $\{T_n\}$ is a consistent sequence of estimates of $\varphi$ converging
uniformly on the subsets of $\mathcal{F}$. For each $n$ let $\varphi_n(P) = E[T_n \mid P]$ be the point
of $K$ whose coordinates are the expectations for $P^n$ of the corresponding coordinates
of $T_n$. Clearly $\varphi_n$ is $\{\mathcal{U}_n, \rho\}$ uniformly continuous on $\mathcal{P}$. In addition, the
coordinates of $T_n$ converge in probability to those of $\varphi(P)$ so that $\varphi_n(P)$ converges
to $\varphi(P)$. It is also clear that the convergence is uniform on the elements
of $\mathcal{F}$. Conversely, let $\{\beta_r\}$ be a sequence of $\{\mathcal{U}, \rho\}$ uniformly continuous functions
from $\mathcal{D}$ to $K$ such that $\beta_r(P)$ converges to $\varphi(P)$ for each $P \in \mathcal{D}$.

For each integer $m$, one can find integers $N(m)$ and $k(m)$ and functions
$f_{m,j}$; $j = 1, 2, \dots, k(m)$, which are $\mathcal{A}^{N(m)}$-measurable and bounded and such
that $P \in \mathcal{D}$ and $Q \in \mathcal{D}$ and

$$\sup_j \Bigl|\int f_{m,j}\,dP^{N(m)} - \int f_{m,j}\,dQ^{N(m)}\Bigr| < 1 \tag{3}$$

implies

$$\rho[\beta_r(P), \beta_r(Q)] < 1/m \tag{4}$$

for every $r \le m$. For a pair $(m, j)$ let

$$\|f_{m,j}\| = \sup\{|f_{m,j}(x)| ;\ x \in \mathcal{X}^{N(m)}\}. \tag{5}$$


Let $\{Z_s;\ s = 1, 2, \dots, S(m)\}$ be a sequence of independent variables taking
their values in $\{\mathcal{X}^{N(m)}, \mathcal{A}^{N(m)}\}$. Chebyshev's inequality implies that

$$\sum_{j=1}^{k(m)} \operatorname{Prob}\Bigl\{\Bigl|\frac{1}{S(m)} \sum_{s=1}^{S(m)} \bigl[f_{m,j}(Z_s) - E f_{m,j}(Z_s)\bigr]\Bigr| \ge \tfrac14\Bigr\} \le \frac{16}{S(m)} \sum_{j=1}^{k(m)} \|f_{m,j}\|^2. \tag{6}$$

Therefore, there exists an integer $S(m)$ such that the left-hand side of the foregoing
inequality is less than $m^{-1}$ whatever may be the distribution of $Z_s$.

Without loss of generality one can assume $N(m + 1) \ge N(m)$ and

$$S(m + 1) \ge 1 + S(m).$$
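The choice of the block count is a one-line computation. The sketch below assumes the standard Chebyshev estimate for a bounded function $f$ with supremum norm $\|f\|$, namely that the mean of $S$ independent copies deviates from its expectation by at least 1/4 with probability at most $16\|f\|^2/S$, summed over the finitely many functions. The helper name `choose_S` is illustrative.

```python
from math import floor


def choose_S(m, sup_norms):
    """Smallest integer S such that (16 / S) * sum(norm ** 2) < 1 / m.

    `sup_norms` lists the supremum norms of the finitely many bounded
    functions used at stage m.
    """
    total = 16 * m * sum(c * c for c in sup_norms)
    return floor(total) + 1  # smallest integer strictly greater than `total`


S = choose_S(3, [1.0, 0.5])
print(S, 16 * (1.0 ** 2 + 0.5 ** 2) / S)   # block count and resulting bound
```

Any larger value of S also works, which is why the sequence S(m) can be taken strictly increasing without loss of generality.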


Let then $\nu(m) = N(m)S(m)$ and let $m(n)$ be the integer $m$ for which $\nu(m) \le
n < \nu(m + 1)$. Let $Z_s$ be the $N(m)$-tuple defined by

$$Z_s = \{X_{(s-1)N(m)+1},\ X_{(s-1)N(m)+2},\ \dots,\ X_{sN(m)}\}. \tag{7}$$


Note that for $\mathcal{U}$ the space $\mathcal{P}$ is precompact (= totally bounded). Hence it is
possible to find a finite subset $\mathcal{D}_m = \{P_{m,l};\ l = 1, 2, \dots, L(m)\}$ of $\mathcal{D}$ such that
if $P \in \mathcal{D}$ there is a $P_{m,l} \in \mathcal{D}_m$ for which

$$\sup_j \bigl|E[f_{m,j} \mid P_{m,l}] - E[f_{m,j} \mid P]\bigr| < \tfrac14, \qquad j = 1, 2, \dots, k(m). \tag{8}$$

Consider the quantity

$$\gamma(P_{m,l}) = \sup_j \Bigl|\frac{1}{S(m)} \sum_{s=1}^{S(m)} f_{m,j}(Z_s) - E[f_{m,j} \mid P_{m,l}]\Bigr| \tag{9}$$

and let $P_n$ be the first element $P_{m,l}$ of $\mathcal{D}_m$ which is such that

$$\gamma(P_n) = \min_l \gamma(P_{m,l}).$$

In this fashion to each point $x$ of $\mathcal{X}^n$ one has associated an element $P_n(x)$
of $\mathcal{D}_m$. The function so defined takes only a finite number of values and the sets
of constancy of $P_n$ are $\mathcal{A}^n$-measurable.
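The selection rule just described is a minimum-distance choice over a finite net. The following sketch illustrates it under simplifying assumptions that are not in the paper: the sample space is {0, 1}, each block is a single observation, there is one test function $f(x) = x$, and the net consists of three Bernoulli distributions represented only by their expectations of $f$. All names are hypothetical.

```python
def select_from_net(blocks, functions, net_expectations):
    """Pick the first net element minimising the discrepancy gamma.

    gamma_l = max over j of |mean over blocks of f_j(Z_s) - E[f_j | P_l]|.
    Returns the chosen index together with the list of all gamma values.
    """
    count = len(blocks)
    empirical = [sum(f(z) for z in blocks) / count for f in functions]
    gammas = [max(abs(e - mu) for e, mu in zip(empirical, expectations))
              for expectations in net_expectations]
    # list.index returns the first position of the minimum, which matches
    # "the first element" in the construction.
    return gammas.index(min(gammas)), gammas


blocks = [1, 1, 0, 1, 1, 1, 0, 1]          # eight observed blocks, mean 0.75
functions = [lambda x: x]                  # a single bounded test function
net_expectations = [[0.2], [0.5], [0.8]]   # E[f | P_l] for a three-point net

index, gammas = select_from_net(blocks, functions, net_expectations)
print(index, gammas)
```

Since the rule only compares finitely many numbers computed from the sample, the selected element takes finitely many values, which is what makes the resulting estimate measurable.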

In addition, if $P$ is the distribution from which the sequence $\{X_i;\ i = 1, 2, \dots,
\nu(m)\}$ is obtained, then

$$\sup_j \Bigl|\frac{1}{S(m)} \sum_{s=1}^{S(m)} f_{m,j}(Z_s) - E[f_{m,j}(Z_s) \mid P]\Bigr| < \tfrac14 \tag{10}$$

except for cases of probability less than $m^{-1}$.
There is a $P_{m,l} \in \mathcal{D}_m$ such that

$$\sup_j \bigl|E(f_{m,j} \mid P) - E(f_{m,j} \mid P_{m,l})\bigr| < \tfrac14. \tag{11}$$

Consequently,

$$\sup_j \Bigl|\frac{1}{S(m)} \sum_{s=1}^{S(m)} f_{m,j}(Z_s) - E(f_{m,j} \mid P_n)\Bigr| < \tfrac12 \tag{12}$$

and finally

$$\sup_j \bigl|E[f_{m,j} \mid P_n] - E[f_{m,j} \mid P]\bigr| < 1 \tag{13}$$

except for cases having a probability less than $m^{-1}$.
By definition of the functions $f_{m,j}$ this inequality implies

$$P^n\Bigl\{\rho[\beta_{m(n)}(P_n), \beta_{m(n)}(P)] > \frac{1}{m}\Bigr\} < \frac{1}{m} \tag{14}$$

for every $P \in \mathcal{D}$.


Take $T'_n = \beta_{m(n)}(P_n)$ and let $T_n$ be any point of $\Theta$ such that

$$\rho[T'_n, T_n] \le \inf\{\rho(T'_n, \theta);\ \theta \in \Theta\} + \frac{1}{n}. \tag{15}$$

Since $T'_n$ takes only a finite number of values, the function $T_n$ will also be $\mathcal{A}^n$-measurable
provided that to any given value of $T'_n$ one always associates the
same value of $T_n$.

By construction we have:

$$P^n\Bigl\{\rho[T'_n, \varphi(P)] > \frac{1}{m(n)} + \rho[\beta_{m(n)}(P), \varphi(P)]\Bigr\} < \frac{1}{m(n)} \tag{16}$$

for every $P \in \mathcal{D}$. Therefore,

$$\rho(T'_n, T_n) \le \frac{1}{n} + \frac{1}{m(n)} + \rho[\beta_{m(n)}(P), \varphi(P)] \tag{17}$$

except for the case of probability less than $m^{-1}$ where the inequality holds within
the brackets of the preceding expression. Finally

$$P^n\Bigl\{\rho[T_n, \varphi(P)] > \frac{1}{n} + \frac{2}{m(n)} + 2\rho[\beta_{m(n)}(P), \varphi(P)]\Bigr\} < \frac{1}{m(n)} \tag{18}$$

for every $P \in \mathcal{D}$.

Since $\rho[\beta_{m(n)}(P), \varphi(P)]$ converges to zero uniformly on the sets $F \in \mathcal{F}$ this
completes the proof of the theorem.

REMARK 1. Suppose that $\mathcal{D}$ is the union of an increasing sequence $\{A_\nu\}$ of
subsets such that

(1) Each element of $\mathcal{F}$ is contained in a set $A_\nu$.

(2) There is a sequence of functions $\varphi_k$ such that $\varphi_k$ is defined and uniformly
continuous on $A_{\nu(k)}$, and $\varphi_k$ converges to $\varphi$ uniformly on the elements of $\mathcal{F}$.

(3) $\nu(k) \to \infty$ as $k \to \infty$.

Then the function $\varphi$ is $\mathcal{F}$-consistently estimable. To prove it, note that $\mathcal{U}$ is
precompact. Hence $\mathcal{D}$ can be completed to a compact space $\bar{\mathcal{D}}$. If $\varphi_k$ is defined
and uniformly continuous on $A_{\nu(k)}$ then $\varphi_k$ can be extended by continuity to the
closure $\bar{A}_{\nu(k)}$ of $A_{\nu(k)}$ in $\bar{\mathcal{D}}$. However, since $\bar{\mathcal{D}}$ is compact, hence normal, one can
then extend $\varphi_k$ to a function $\bar{\varphi}_k$ which is defined and continuous on $\bar{\mathcal{D}}$, hence a
fortiori on the whole of $\mathcal{D}$.

It is clear that $\{\bar{\varphi}_k\}$ converges to $\varphi$ uniformly on the elements of $\mathcal{F}$.

REMARK 2. The structure $\mathcal{U}$ enters in Theorem 1 only by the space of uniformly
continuous bounded numerical functions it determines on $\mathcal{D}$. Any other
structure giving rise to the same space of uniformly continuous functions could
be substituted for $\mathcal{U}$.


3. Relation between various types of continuity. The preceding theorem involves
the uniform continuity of functions $\varphi_\nu$ with respect to a uniform structure
$\mathcal{U}$ which is not very easily accessible. For this reason some remarks on the structure
$\mathcal{U}$ and its relation to other structures are in order.

One can define on $\mathcal{P}$ a norm, called the $L_1$-norm, by the expression $\|P - Q\| =
\sup |\int f\,dP - \int f\,dQ|$, the supremum being taken over the set of $\mathcal{A}$-measurable
functions $f$ which are bounded by $(-1)$ and $(+1)$. If $\lambda = P + Q$ this can also
be written

$$\|P - Q\| = \int \Bigl|\frac{dP}{d\lambda} - \frac{dQ}{d\lambda}\Bigr|\,d\lambda.$$
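On a finite sample space the two expressions for the norm can be compared directly. The sketch below is an illustration with made-up distributions: it evaluates the sum of $|p(x) - q(x)|$, which is what the density form reduces to for discrete measures, and the value of $|\int f\,dP - \int f\,dQ|$ for the function $f$ equal to $+1$ where $p \ge q$ and $-1$ elsewhere. The function names are hypothetical.

```python
from fractions import Fraction as Fr


def l1_norm(p, q):
    """Density form of the norm: sum of |p(x) - q(x)| over a finite support."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0) - q.get(x, 0)) for x in support)


def sup_form(p, q):
    """|int f dP - int f dQ| for f = +1 where p >= q and f = -1 elsewhere."""
    support = set(p) | set(q)
    f = {x: 1 if p.get(x, 0) >= q.get(x, 0) else -1 for x in support}
    return abs(sum(f[x] * (p.get(x, 0) - q.get(x, 0)) for x in support))


P = {"a": Fr(1, 2), "b": Fr(1, 2)}
Q = {"a": Fr(1, 4), "b": Fr(1, 4), "c": Fr(1, 2)}
print(l1_norm(P, Q), sup_form(P, Q))
```

The two values agree because that choice of $f$ makes every term of the difference non-negative, so it attains the supremum over functions bounded by $-1$ and $+1$.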


It is easily seen that the structure $\mathcal{U}(\mathfrak{N})$ defined by this norm is finer than $\mathcal{U}$.
This gives the following corollary.

COROLLARY 1. For the function $\varphi$ to be $\mathcal{F}$-consistently estimable it is necessary that
there be a sequence $\{\varphi_k\}$ of functions from $\mathcal{D}$ to $K$ with the following properties:

(1) $\{\varphi_k\}$ converges to $\varphi$ uniformly on the elements of $\mathcal{F}$.

(2) Each $\varphi_k$ is uniformly continuous for the structure $\mathcal{U}(\mathfrak{N})$ on $\mathcal{D}$ and the structure
defined by $\rho$ on $K$.

For the $\mathcal{F}$-consistent estimability of $\varphi$ it is sufficient that the $\varphi_k$'s be uniformly
continuous with respect to one of the structures $\mathcal{U}_n$.

The above corollary may be used to show that, for certain hypotheses, consistent
tests do not exist. For instance, let $\mathcal{D}$ be the family of distributions having
densities with respect to the Lebesgue measure on the real line. There do not
exist consistent tests of the hypothesis that the expectation of the distribution
is finite. The function to be estimated is the indicator of the set representing
the hypothesis tested in $\mathcal{D}$. It can easily be seen that this function is not a pointwise
limit of a sequence of functions which are continuous for the norm $\mathfrak{N}$.

When the space $\mathcal{X}$ is the real line (or a Euclidean space) it is customary to
define distances, and consequently uniform structures on $\mathcal{P}$, by taking either

$$\delta(P, Q) = \sup_x \bigl|P[X \le x] - Q[X \le x]\bigr| \tag{19}$$

or

$$\lambda(P, Q) = \inf\{a : -a + P(X \le x - a) \le Q(X \le x) \le P(X \le x + a) + a\}. \tag{20}$$

The distance $\delta$ is referred to as the Kolmogorov-Smirnov distance and $\lambda$ as the
Paul Lévy distance.





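Denote by $\mathcal{U}(\delta)$ and $\mathcal{U}(\lambda)$ the corresponding uniform structures. Further, let
$\mathcal{T}$ be the topology associated with $\mathcal{U}$ and let $\mathcal{T}(\mathfrak{N})$, $\mathcal{T}_n$, $\mathcal{T}(\delta)$, $\mathcal{T}(\lambda)$ be the topologies
associated with the other structures just defined. Finally, let $A$ (resp. $A(\mathfrak{N})$,
$A_n$, $A(\delta)$, $A(\lambda)$) be the spaces of bounded uniformly continuous numerical functions
for the structure $\mathcal{U}$ (resp. $\mathcal{U}(\mathfrak{N})$, etc.).

It is well known that the following inclusions hold and are usually strict.

$$\begin{aligned} &\mathcal{T}(\mathfrak{N}) \supset \mathcal{T} \supset \mathcal{T}_n \supset \mathcal{T}(\delta) \supset \mathcal{T}(\lambda), \\ &\mathcal{U}(\mathfrak{N}) \supset \mathcal{U} \supset \mathcal{U}_n, \\ &\mathcal{U}(\mathfrak{N}) \supset \mathcal{U}(\delta) \supset \mathcal{U}(\lambda). \end{aligned} \tag{21}$$

The structures $\mathcal{U}_n$ and $\mathcal{U}(\delta)$ are not comparable. Similarly the structures $\mathcal{U}_n$
and $\mathcal{U}(\lambda)$ are not comparable. However, it will be shown further on that

$$A(\mathfrak{N}) \supset A \supset A(\delta). \tag{22}$$

It does not seem to be generally true that $A_n \supset A(\delta)$ or that $A_n \supset A(\lambda)$. Let us
show this for $A_1$ and $A(\delta)$, for the sake of completeness.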

Consider the family $\mathcal{D}$ of all distributions $P$ on the real line $R$ which are such
that there is some point $x$ for which $P\{x\} \ge 2/3$. Let $\varphi(P)$ be defined by $\varphi(P) =
\sup_x P\{x\}$. Then $\varphi$ is uniformly continuous on $\mathcal{D}$ for the structure $\mathcal{U}(\delta)$. Let
$\{f_j;\ j = 1, 2, \dots, m\}$ be a finite family of bounded $\mathcal{A}$-measurable numerical
functions. For every $x \in R$ let $F(x)$ be the point $F(x) = \{f_1(x), f_2(x), \dots, f_m(x)\}$
in the $m$-dimensional Euclidean space $\mathcal{E}_m$. For every $\epsilon > 0$ there exist two points
$x$ and $y$ of $R$ such that

$$|F(x) - F(y)| = \max_j |f_j(x) - f_j(y)| < \epsilon. \tag{23}$$

Indeed $F(R)$ is either finite or an infinite set having at least one accumulation
point. Let $P$ be a measure giving mass 5/6 to $x$ and let $Q$ be the measure obtained
from $P$ by removing a mass 1/6 at $x$ and placing it at $y$. Then

$$|\varphi(P) - \varphi(Q)| \ge \tfrac16 \quad \text{and} \quad \Bigl|\int f_j\,dP - \int f_j\,dQ\Bigr| \le \tfrac16\epsilon. \tag{24}$$
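The counterexample can be reproduced with exact arithmetic. In the sketch below the points `x`, `y`, `z`, the location of the remaining mass 1/6 of $P$, and the two bounded test functions are illustrative choices, not taken from the paper. Moving mass 1/6 from `x` to `y` changes the largest atom by exactly 1/6, while each integral changes by $(1/6)|f(x) - f(y)|$, which is small whenever $f(x)$ and $f(y)$ are close.

```python
from fractions import Fraction as Fr


def phi(p):
    """Largest atom of a discrete distribution given as {point: mass}."""
    return max(p.values())


def integral(f, p):
    """Integral of f with respect to the discrete distribution p."""
    return sum(f(point) * mass for point, mass in p.items())


x, y, z = Fr(0), Fr(1, 1000), Fr(5)
P = {x: Fr(5, 6), z: Fr(1, 6)}
Q = {x: Fr(4, 6), y: Fr(1, 6), z: Fr(1, 6)}   # mass 1/6 moved from x to y

tests = [lambda u: u / (1 + abs(u)),          # two bounded test functions
         lambda u: u * u / (1 + u * u)]

print(phi(P) - phi(Q))
for f in tests:
    print(abs(integral(f, P) - integral(f, Q)), abs(f(x) - f(y)) / 6)
```

Both distributions stay in the family, since each keeps an atom of mass at least 2/3, yet no finite set of bounded test functions can separate them once `y` is chosen close enough to `x` in the sense of (23).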


To show that the conditions of uniform continuity with respect to $\mathcal{U}$ cannot
usually be replaced by mere continuity with respect to the topology generated
by $\mathcal{U}$, consider the following example.

Let $\mathcal{X}$ be the interval $[0, 1]$ and let $\mathcal{A}$ be the $\sigma$-field of Borel sets of $\mathcal{X}$. Let $\mathcal{D}$
be the class of probability measures $\mathcal{D} = \{\delta_x;\ x \in \mathcal{X}\}$ where $\delta_x$ is the measure
giving mass unity to the point $x$. The uniform structures $\mathcal{U}$ and $\mathcal{U}_1$ coincide on $\mathcal{D}$.
Further, identifying $\mathcal{X}$ and $\mathcal{D}$ one can easily verify that the $\mathcal{U}$ or $\mathcal{U}_1$ uniformly
continuous bounded numerical functions on $\mathcal{D}$ are precisely the $\mathcal{A}$-measurable
bounded numerical functions on $\mathcal{X}$. In particular, the pointwise limit of a sequence
of uniformly continuous functions is uniformly continuous if it is bounded.
However, the topology associated to $\mathcal{U}$ is discrete, so that every numerical function
on $\mathcal{X}$ is continuous.





In other words, one should expect that there will exist continuous functions
which are not pointwise limits of sequences of uniformly continuous functions.
It seems also plausible that, in general, there will be functions which are $\mathcal{U}$-uniformly
continuous but not limits of sequences of $\mathcal{U}_1$-uniformly continuous functions.

However, it is easily seen that if all the elements of $\mathcal{D}$ are absolutely continuous
with respect to a given probability measure $\mu$, then every $\mathcal{U}$-uniformly continuous
function on $\mathcal{D}$ is the pointwise limit of a sequence of $\mathcal{U}_1$-uniformly continuous
functions.

Since $\mathcal{U}_1$ is much more manageable than $\mathcal{U}$ the following theorem is of interest.

PROPOSITION 1. Let $A$ be a subset of $\mathcal{D}$ which is relatively compact in $\mathcal{P}$ for the
topology induced by $\mathcal{U}_1$. Then on $A$ the structures $\mathcal{U}$ and $\mathcal{U}_1$ coincide.

PROOF. The proof depends on a well-known theorem of Dunford and Pettis
(see [5] and also [6]) which states that $A$ is relatively compact in $\mathcal{P}$ if and only
if one of the following two equivalent conditions is satisfied:

(1) There is a finite measure $\mu$ such that for every $\epsilon > 0$ there is a $\delta > 0$ for
which $\mu(B) < \delta$ implies $P(B) < \epsilon$ for every $P \in A$.

(2) Every sequence $\{P_k;\ k = 1, 2, \dots\}$ of elements of $A$ contains a subsequence
which converges to an element of $\mathcal{P}$.

Let $\bar{A}$ be the closure of $A$ in $\mathcal{P}$ for the structure $\mathcal{U}_1$. If $\{P_k\}$ is a sequence of
elements of $\bar{A}$ which converges to $P_0$ for $\mathcal{U}_1$ then $P_k^n$ converges to $P_0^n$ for the
structure $\mathcal{U}_1$ because of the equicontinuity described in Condition (1) above.
Hence $\bar{A}$ is also compact for $\mathcal{U}_n$ and therefore for $\mathcal{U}$ itself. But $\mathcal{U}$ being compact
and finer than $\mathcal{U}_1$, and $\mathcal{U}_1$ being separated, $\mathcal{U}$ and $\mathcal{U}_1$ must coincide on the set $\bar{A}$,
hence on $A$.

From this result we can deduce the following.

PROPOSITION 2. Let $\{A_k;\ k = 1, 2, \dots\}$ be a sequence of subsets of $\mathcal{D}$ such that

$$\mathcal{D} = \bigcup_k A_k.$$

Assume that each element of $\mathcal{F}$ is contained in a finite union of sets $A_k$.

If each one of the sets $A_k$ is relatively compact for $\mathcal{U}_1$ in $\mathcal{P}$ then uniform continuity
with respect to $\mathcal{U}$ can be replaced by uniform continuity with respect to $\mathcal{U}_1$ in
the statement of Theorem 1.

If each one of the sets $A_k$ is compact for $\mathcal{U}_1$ then uniform continuity with respect
to $\mathcal{U}$ can be replaced by continuity with respect to $\mathcal{U}_1$ in the statement of Theorem 1.

PROOF. First one can assume that $A_k \subset A_{k+1}$. To prove the second statement
let $H$ be the space of functions from $\mathcal{D}$ to $K$ which are $\mathcal{U}$-uniformly continuous
on $\mathcal{D}$. If $\beta_k$ is a continuous function from $\mathcal{D}$ to $K$ there is an element $\alpha_k$ of $H$ such
that $\rho[\alpha_k(P), \beta_k(P)] < k^{-1}$ for every $P \in A_k$. This is easily seen by application
of the Stone-Weierstrass theorem ([1], chap. 10, p. 55 or [7], p. 9) to the coordinates
of $\beta_k$ in $K$.

Consequently, if $\{\beta_k\}$ converges to $\varphi$ uniformly on the element $F$ of $\mathcal{F}$ and if
$F \subset A_k$ then $\alpha_k$ converges to $\varphi$ uniformly on the elements of $\mathcal{F}$.

The first statement is a consequence of Proposition 1 and of the remark made
after the proof of Theorem 1.





To show that the families $\mathcal{D}$ which satisfy the conditions of Proposition 2 are
not exceedingly rare let us mention the following. If $A$ is a set of probability
measures which are all absolutely continuous with respect to a given finite measure
$\mu$ then $A$ is relatively compact if all the densities $dP/d\mu$ are bounded by the
same number $M$ or more generally if they are bounded by a given $\mu$-integrable
function. Hence, if $\mathcal{D}$ consists of probabilities whose densities with respect to a
finite measure $\mu$ are bounded (not uniformly) then $\mathcal{D}$ is a union $\mathcal{D} = \bigcup A_k$ of
$\mathcal{U}_1$ relatively compact sets.

Another example is the following. Suppose that $\xi$ is a parameter taking its
values in a subset $S$ of a Euclidean space $\mathcal{E}$. Assume that $S$ is the intersection of
an open set of $\mathcal{E}$ with a closed set of $\mathcal{E}$. To each $\xi \in S$ make correspond a probability
measure $P_\xi$ on the real line. Assume that:

(1) If $\xi_n \to \xi_0$ then the distribution functions of the $P_{\xi_n}$ converge to the distribution
function of $P_{\xi_0}$ at all points of continuity of the latter.

(2) For each $\xi \in S$ there is a neighborhood $V(\xi)$ of $\xi$ in $S$, a finite measure $\mu_\xi$
and a $\mu_\xi$-integrable function $f_\xi$ such that $dP_{\xi'}/d\mu_\xi \le f_\xi$ for every $\xi' \in V(\xi)$.

Let $C$ be a compact subset of $S$ and let $A(C) = \{P_\xi;\ \xi \in C\}$; then $A(C)$ is compact
in $\mathcal{P}$ for $\mathcal{U}_1$. Since $S$ is a union of a sequence of compact sets $C_k$ the set
$\mathcal{D} = \{P_\xi;\ \xi \in S\}$ is a union of a sequence of compact sets.

As an example of a different phenomenon, suppose that all the $P_\xi$ defined
above, instead of satisfying (1) and (2), satisfy

(3) If $\xi_n \to \xi_0$ and if $\mu$ is a $\sigma$-finite measure with respect to which all the
$\{P_{\xi_n};\ n = 0, 1, 2, \dots\}$ are absolutely continuous, then $dP_{\xi_n}/d\mu$ tends to $dP_{\xi_0}/d\mu$
in $\mu$ measure.

Under such a stringent restriction, it follows from Scheffé's theorem
[8] that if $C$ is compact then $A(C) = \{P_\xi;\ \xi \in C\}$ is compact in the sense of the
$L_1$-norm.

From these considerations one can deduce the following result:

PROPOSITION 3. Let $\Theta$ be a subset of a Euclidean space $\mathcal{E}$. Assume that to each
$\theta \in \Theta$ there corresponds a probability measure $P_\theta$ on the real line, in such a way that
$P_{\theta_1} = P_{\theta_2}$ implies $\theta_1 = \theta_2$. Furthermore, assume that the following conditions hold:

(1) If $\theta_n$ converges to $\theta_0$ then the distribution functions of the $P_{\theta_n}$ converge to the
distribution function of $P_{\theta_0}$ at all points of continuity of the latter.

(2) For each $\theta \in \Theta$ there is a neighborhood $V(\theta)$ of $\theta$ and a finite measure $\mu_\theta$ such
that for every $\epsilon > 0$, there is a $\delta > 0$ for which the inequality $\mu_\theta(A) < \delta$ implies
$P_t(A) < \epsilon$ for $t \in V(\theta)$.

(3) $\Theta$ is the intersection of an open set of $\mathcal{E}$ with a closed subset of $\mathcal{E}$.

(4) Each element of $\mathcal{F}$ is contained in a compact subset of $\Theta$.

Let $\theta \to \varphi(\theta)$ be a numerical function defined on $\Theta$.

In order that there exist an $\mathcal{F}$-consistent sequence of estimates of $\varphi$ it is necessary
and sufficient that $\varphi$ be the limit $\mathcal{F}$-uniformly of a sequence of continuous functions
of $\theta$.

In particular, if $\mathcal{F}$ is the family of all points of $\Theta$, there exists a consistent
estimate of $\varphi$ if and only if $\varphi$ is of the first Baire class on $\Theta$.

PROOF. Since the correspondence $\theta \leftrightarrow P_\theta$ is one to one the function $\varphi(\theta)$ can
also be considered as a function defined on $\mathcal{D}$. Let $\psi(P) = \varphi[\theta(P)]$. If $\{\psi_k\}$ is a
sequence of continuous functions defined on $\mathcal{D}$ and converging uniformly to
$\psi(P)$ on the images of the elements of $\mathcal{F}$ then $\{\varphi_k\}$ defined by $\varphi_k(\theta) = \psi_k(P_\theta)$
converges $\mathcal{F}$-uniformly to $\varphi$. Hence the necessity of the condition.

To prove the sufficiency, let $\{C_k\}$ be a sequence of compact subsets of $\Theta$ such
that $C_k \subset C_{k+1}$ and such that every compact subset of $\Theta$ be contained in a $C_k$.
Let $A_k = \{P_\theta;\ \theta \in C_k\}$ be the image of $C_k$ in $\mathcal{D}$.

Since the function $\theta \to P_\theta$ is continuous and one to one, the inverse function
$P \to \theta(P)$ is continuous on each one of the compacts $A_k$.

Let $\{\varphi_k\}$ be a sequence of continuous functions of $\theta$ converging to $\varphi$ uniformly
on the elements of $\mathcal{F}$. Define $\psi_k$ by $\psi_k(P) = \varphi_k[\theta(P)]$. Then $\psi_k$ converges to $\psi$
uniformly on the images of the elements of $\mathcal{F}$ and $\psi_k$ is continuous, hence uniformly
continuous on each $A_k$. The result follows by the remark made at the
end of the proof of Theorem 1.

As an example of application consider the case where $\mathcal{F}$ is either the family
of compact subsets of $\Theta$ or more generally any family of compact sets such that
each $\theta \in \Theta$ is interior to an element $F_\theta$ of $\mathcal{F}$.

Then a function $\varphi$ is $\mathcal{F}$-consistently estimable if and only if it is continuous on
$\Theta$.

Another result obtainable directly from Proposition 1 is the following. Let
$\mathcal{D}$ be relatively compact in $\mathcal{P}$ for $\mathcal{U}_1$. Let $A$ and $B$ be two disjoint subsets of $\mathcal{D}$.

A sequence of tests of the hypothesis $A$ against the alternative $B$ is a sequence
of measurable functions $T_n$ from $\{\mathcal{X}^n, \mathcal{A}^n\}$ to the interval $[0, 1]$. The sequence is
called uniformly consistent if $\varphi_n(P) = E[T_n \mid P]$ converges to zero on $A$ and to
one on $B$, the convergence being uniform in $P$. It is clear that the existence of a
uniformly consistent sequence of tests is equivalent to the existence of a uniformly
consistent estimate of the function $\varphi$ equal to zero on $A$ and to one on $B$. Therefore,
such a sequence $T_n$ will exist if and only if the indicator of $A$ in $\mathcal{D}$ is $\mathcal{U}_1$-uniformly
continuous, that is, if there is a finite family $\{f_j;\ j = 1, 2, \dots, m\}$ of
$\mathcal{A}$-measurable bounded functions on $\mathcal{X}$ such that

$$\sup_j \Bigl|\int f_j\,dP - \int f_j\,dQ\Bigr| < 1$$

implies that either both $P$ and $Q$ are elements of $A$ or both are elements of $B$.

Another type of restriction on $\mathcal{D}$ under which the structure $\mathcal{U}$ can be replaced
by a somewhat more accessible structure is the restriction considered by W.
Hoeffding and J. Wolfowitz in [9].

To simplify we shall present this condition only for the case of the line, although
the argument is given by these authors for an arbitrary Euclidean space.
Let $P$ and $Q$ be two probability measures on the line and let $f$ and $g$ be the densities
of $P$ and $Q$ with respect to $\mu = P + Q$.

If there are intervals $I_i$, $i = 1, 2, \dots, m$ such that $f - g$ has a constant sign on
each $I_i$, let $V' = \bigcup_i I_i$. Let $J[P, Q; \epsilon]$ be the smallest value of $m$ for which
$\min[P(V), Q(V)] \le \epsilon$ if such a value exists. Otherwise let $J[P, Q; \epsilon] = \infty$.

A family $\mathcal{D}$ satisfies the H-W condition if for every $\epsilon > 0$ the quantity

$$\sup\{J[P, Q; \epsilon];\ P \in \mathcal{D},\ Q \in \mathcal{D}\} \tag{25}$$

is finite.

Hoeffding and Wolfowitz show that most of the usual parametric families of
univariate distributions satisfy the H-W condition. Furthermore, these authors
have shown that, $\delta$ representing the Kolmogorov-Smirnov distance, the inequality

$$\|P - Q\| \le 4J[P, Q; \epsilon]\,\delta(P, Q) + 2\epsilon \tag{26}$$

always holds. This inequality implies that, on a set $\mathcal{D}$ satisfying the H-W condition,
the distance $\delta$ and the norm $\|P - Q\|$ are equivalent. In other words, the
H-W condition implies that $\mathcal{U}(\mathfrak{N}) = \mathcal{U}(\delta)$.

The classical result of Glivenko and Cantelli implies that, $P_n$ denoting the
empirical distribution of a sample of $n$ independent variables having a distribution
$P$, the distance $\delta(P_n, P)$ converges to zero in probability uniformly for
$P \in \mathcal{P}$. This, in turn, by application of Theorem 1, shows that, whatever may be
$\mathcal{D}$, a function $\varphi$ from $\mathcal{D}$ to $K$ which is $\mathcal{U}(\delta)$-uniformly continuous on $\mathcal{D}$ is also
$\mathcal{U}$-uniformly continuous on $\mathcal{D}$.

Conversely, if $\mathcal{D}$ satisfies the H-W condition, the space of functions which are
$\mathcal{U}(\mathfrak{N})$-uniformly continuous coincides with the space of functions which are $\mathcal{U}$-uniformly
continuous and with the space of functions which are $\mathcal{U}(\delta)$-uniformly continuous.
Hence, under the H-W condition, the structure $\mathcal{U}$ can be replaced by either
$\mathcal{U}(\mathfrak{N})$ or $\mathcal{U}(\delta)$ in Theorem 1.


4. Historical note. Several authors have obtained results on the existence of
consistent estimates. Here are some incomplete references on the subject.

A study of the existence of consistent tests has been made for particular cases
by Mrs. A. Berger in [10] and [11]. The subject was investigated further by C.
Kraft in [12] without making assumptions of independence and identity of distributions.
The recent paper [9] by W. Hoeffding and J. Wolfowitz contains a
very deep study of the concept of distinguishability of sets of probability measures.
These authors place themselves in a framework where the variables are
independent and identically distributed. Their concept of finite distinguishability
corresponds roughly to the concept of existence of uniformly consistent
tests.

Theorems of the same nature as Theorem 1 have been obtained by J. L.
Hodges, Jr. for the existence of consistent tests and by C. Stein for the existence
of consistent estimates, and presented at a meeting of the Institute of Mathematical
Statistics in Boston in 1952. These two authors restricted themselves to
pointwise convergence. The result mentioned by C. Stein could be quoted as
follows: "For a function $\varphi$ on $\mathcal{D}$ to be pointwise consistently estimable it is
necessary that it be the limit of a sequence of $\mathcal{U}(\mathfrak{N})$-uniformly continuous functions
and it is sufficient that it be the limit of a sequence of $\mathcal{U}(\delta)$-uniformly
continuous functions." Thus, under the H-W condition C. Stein's result coincides
with ours.
REFERENCES 
[1] N. Bourbaki, Éléments de Mathématique, Première Partie. Les Structures Fondamentales
de l'Analyse, Livre 3, Topologie Générale, Hermann et Cie., Paris, 1940-1953.
[2] John L. Kelley, General Topology, Van Nostrand, New York, 1955.
[3] V. A. Efremovicz, "The geometry of proximity I," Mat. Sbornik N. S., 31 (73) (1952),
pp. 189-200.
[4] I. S. Gál, Amer. Math. Soc. Notices, Vol. 6 (1959), Abstract 554-23, p. 140.
[5] Nelson Dunford and Jacob T. Schwartz, Linear Operators, Part I. General Theory,
Interscience Publishers, Inc., New York, N. Y., 1958.
[6] A. Grothendieck, "Sur les applications faiblement compactes d'espaces du type
C(K)," Canadian J. of Math., Vol. 5 (1953), pp. 129-173.
[7] Lynn H. Loomis, Abstract Harmonic Analysis, Van Nostrand, New York (1953).
[8] Henry Scheffé, "A useful convergence theorem for probability distributions," Ann.
Math. Stat., Vol. 18 (1947), pp. 434-438.
[9] Wassily Hoeffding and J. Wolfowitz, "Distinguishability of sets of distributions,"
Ann. Math. Stat., Vol. 29 (1958), pp. 700-718.
[10] Agnes Berger, "On uniformly consistent tests," Ann. Math. Stat., Vol. 22 (1951), pp.
289-293.
[11] Agnes Berger, "On orthogonal probability measures," Proc. Amer. Math. Soc., Vol.
4 (1953), pp. 800-806.
[12] Charles Kraft, "Some conditions for consistency and uniform consistency of statistical
procedures," Univ. of Calif. Publ. in Stat., Univ. of California Press,
Berkeley, Vol. 2 (1955), pp. 125-142.





THE DISTRIBUTION OF THE LATENT ROOTS OF THE 
COVARIANCE MATRIX 


By Alan T. James¹
Yale University 


1. Summary. The distribution of the latent roots of the covariance matrix
calculated from a sample from a normal multivariate population was found by
Fisher [3], Hsu [6] and Roy [10] for the special, but important, case when the
population covariance matrix is a scalar matrix, $\Sigma = \sigma^2 I$. By use of the representation
theory of the linear group, we are able to obtain the general distribution
for arbitrary $\Sigma$.


2. An integral expression for the distribution. Suppose the sample consists of
$N$ observations from a normal $k$-variate population with covariance matrix $\Sigma$.
After the usual orthogonal transformation to eliminate the sample means, we
have a $k \times n$ matrix $X$, $n = N - 1$, distributed as

$$(2\pi)^{-kn/2}\,\operatorname{etr}\bigl(-\tfrac12 \Sigma^{-1} XX'\bigr)\,|\Sigma|^{-n/2} \prod_{i,j} dx_{ij}, \tag{1}$$

where the symbol etr stands for the exponential of the trace of a square matrix.
If $A = (a_{ij})$, then $\operatorname{etr}(A) = \exp(a_{11} + a_{22} + \cdots + a_{nn})$. Our object is to find
the distribution of the latent roots $t_1, t_2, \dots, t_k$ of the matrix $XX'$ ($t_1 \ge t_2 \ge
\cdots \ge t_k$).

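By expressing $X$ as a function of the $t_i$ and other variables and integrating
with respect to the latter, Fisher, Hsu and Roy showed that

$$\int_{\text{other variables}} \prod_{i,j} dx_{ij} = c \left(\prod_{i=1}^{k} t_i\right)^{(n-k-1)/2} \prod_{i<j} (t_i - t_j) \prod_{i=1}^{k} dt_i, \tag{2}$$

where

$$c = 2^{-2k} \prod_{i=1}^{k} A(n - i + 1) \prod_{i=1}^{k} A(i). \tag{3}$$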


Received May 11, 1959 
¹ This work was initiated in the Division of Mathematical Statistics of the
Commonwealth Scientific and Industrial Research Organization, Australia, and
completed at Yale University.




In the special case when $\Sigma = \sigma^2 I_k$, the density in (1) is a function of the $t_i$
alone, and thus they obtained the distribution as

$$(2\pi)^{-kn/2}\, \sigma^{-kn}\, c\, \exp\left(-\frac{t_1 + t_2 + \cdots + t_k}{2\sigma^2}\right) \left(\prod_{i=1}^{k} t_i\right)^{(n-k-1)/2} \prod_{i<j} (t_i - t_j) \prod_{i=1}^{k} dt_i. \tag{4}$$

For general $\Sigma$, the density

$$\mathrm{etr}\left(-\tfrac{1}{2}\Sigma^{-1}XX'\right) \tag{5}$$

is no longer a function of the $t_i$. This is equivalent to saying that (5) is not
invariant under congruence transformation by the orthogonal group $O(k)$ of
$k \times k$ orthogonal matrices $H$,

$$XX' \to HXX'H', \qquad H \in O(k).$$


However, by an argument similar to the one given for the derivation of the 
noncentral Wishart distribution in James [7], one sees that the distribution of the 
t; is not altered if the density function in the initial distribution (1) is symme- 
trized, by which we mean that the function (5) occurring in (1) is replaced by 
its average 


$$\int_{O(k)} \mathrm{etr}\left(-\tfrac{1}{2}\Sigma^{-1}H'XX'H\right) d(H) \tag{6}$$


with respect to the invariant measure, d(H), on the orthogonal group, O(k). The 
invariant measure is normalized to make the total measure of O(k) unity. The 
symmetrized function (6) is now a function of the ¢; , and, putting XX’ = A, we 
have the 

THEOREM 1. The general distribution of the latent roots $t_1, \ldots, t_k$ of the matrix $A$
of sums of squares and products about the means, of a sample of $N = n + 1$
observations from a $k$-variate normal population with covariance matrix $\Sigma$, is

$$(2\pi)^{-kn/2}\, c\, |\Sigma|^{-n/2} \int_{O(k)} \mathrm{etr}\left(-\tfrac{1}{2}\Sigma^{-1}HAH'\right) d(H) \prod_{i=1}^{k} t_i^{(n-k-1)/2} \prod_{i<j} (t_i - t_j) \prod_{i=1}^{k} dt_i, \tag{7}$$

where the constant $c$ is given by (3) and $d(H)$ is the invariant measure on the
orthogonal group $O(k)$. The integral is a symmetric function of the latent roots of $\Sigma$ and
of the latent roots $t_1, \ldots, t_k$ of $A$.

Formulae for the distribution, similar to (7), are well known. The real problem, 
to which we now turn, is to evaluate the integral which is of the form 


$$\int_{O(k)} \mathrm{etr}(BHAH')\, d(H) \tag{8}$$


where A and B are k X k symmetric matrices. Our results, as summarized in 
Theorem 2, are an expansion for the integral in a series of zonal polynomials 
which are given, up to fourth order, in the appendix. 
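
As a numerical cross-check on any closed form for (8), the integral can be estimated by brute force. The following is only a sketch, not the method of the paper: it assumes NumPy, and the names `haar_orthogonal` and `etr_average` are hypothetical helpers introduced here. It draws orthogonal matrices from the normalized invariant measure (via the QR factorization of a Gaussian matrix with a sign correction) and averages $\mathrm{etr}(BHAH')$.

```python
import numpy as np

def haar_orthogonal(k, rng):
    """Draw one k x k orthogonal matrix from the normalized invariant measure on O(k)."""
    G = rng.standard_normal((k, k))
    Q, R = np.linalg.qr(G)
    # QR is unique only up to column signs; fixing them makes Q invariantly distributed.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs  # multiplies column j of Q by signs[j]

def etr_average(B, A, n_draws=20000, seed=0):
    """Monte Carlo estimate of the average of etr(B H A H') over O(k)."""
    rng = np.random.default_rng(seed)
    k = A.shape[0]
    total = 0.0
    for _ in range(n_draws):
        H = haar_orthogonal(k, rng)
        total += np.exp(np.trace(B @ H @ A @ H.T))
    return total / n_draws
```

When $B$ is a scalar matrix $bI$ the integrand is constant, so the estimate must equal $\exp(b\,\mathrm{tr}\,A)$ for any number of draws; this gives a quick sanity check before comparing against a series expansion.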







One possible method of finding the integral would be to expand the integrand
in a power series and calculate the integrals of the resulting monomials in the
elements of H by using the generating function given by James [8]. This method 
was used to check our results up to third order but beyond this it became too 
cumbersome. 


The use of the theory of spherical and zonal functions is far more powerful 
and much more enlightening. 


3. Spherical and zonal functions. The initial distribution (1) of a normal 
multivariate sample is highly symmetrical and this provides the clue to the 
evaluation of the integral (8). The distribution (1) is clearly invariant under the 
group O(n) of orthogonal matrices H of order n acting upon X from the right 


$$X \to XH, \qquad H \in O(n). \tag{9}$$


It is also invariant under the linear group $G(k)$ of all real $k \times k$ nonsingular
matrices $L$ acting upon $X$ from the left and upon $\Sigma$, simultaneously, by
congruence transformation

$$X \to LX, \tag{10}$$
$$\Sigma \to L\Sigma L'. \tag{11}$$

Transformations (10) and (11) imply that $A = XX'$ is transformed by congruence
transformation cogrediently, and the information matrix $\Sigma^{-1}$
contragrediently,

$$A \to LAL', \tag{12}$$
$$\Sigma^{-1} \to L'^{-1}\Sigma^{-1}L^{-1}, \qquad L \in G(k). \tag{13}$$

The function $\mathrm{tr}(\Sigma^{-1}XX')$ upon which the probability in (1) depends and the
volume element

$$|\Sigma|^{-n/2} \prod_{i,j} dx_{ij}$$

are invariants under the transformations. $\mathrm{tr}(\Sigma^{-1}A)$ is the sum of the latent
roots $\lambda_i$ of the determinantal equation

$$|\lambda\Sigma - A| = 0, \tag{14}$$

which, apart from their order, are a complete set of invariants of the pair of
positive definite matrices $\Sigma$ and $A$ under their simultaneous transformation in (11)
and (12). This is proved by the fact that one can choose an $L \in G(k)$ such that

$$L\Sigma L' = I, \qquad LAL' = \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_k). \tag{15}$$

Hence $\lambda_1, \ldots, \lambda_k$ are a complete set of invariants.
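
The existence of such an $L$ can be made concrete. The sketch below is one possible construction, assuming NumPy; the function name `simultaneous_reduction` is hypothetical. It takes a Cholesky factor of $\Sigma$ and then diagonalizes the reduced matrix with an orthogonal matrix, which yields an $L$ that sends $\Sigma$ to $I$ and $A$ to a diagonal matrix of the roots of the determinantal equation.

```python
import numpy as np

def simultaneous_reduction(Sigma, A):
    """Return (L, lam) with L Sigma L' = I and L A L' = diag(lam).

    lam holds the roots of |lam * Sigma - A| = 0 in ascending order.
    """
    C = np.linalg.cholesky(Sigma)      # Sigma = C C'
    C_inv = np.linalg.inv(C)
    M = C_inv @ A @ C_inv.T            # symmetric; same roots as Sigma^{-1} A
    lam, U = np.linalg.eigh(M)         # M = U diag(lam) U'
    L = U.T @ C_inv
    return L, lam
```

The sum of the returned roots equals $\mathrm{tr}(\Sigma^{-1}A)$, the invariant on which the density depends.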







One might think that, having obtained a complete set of invariants, one had 
exhausted all the information supplied by the symmetry. However, this is not 
so. Within the spaces of functions defined on the matrices there is a deeper and 
much more extensive structure associated with the symmetry, which is revealed 
by the theory of group representations. This leads to a generalized Fourier or 
harmonic analysis of the (scalar valued) functions on the positive definite sym- 
metric matrices. They are seen to be a system of generalized spherical functions, 
under congruence transformation by the linear group, in the sense of Berezin 
and Gelfand [1] and Godement [5]. Functions of the latent roots of the deter- 
minantal equation (14), being functions of two matrices invariant under their 
simultaneous transformation, play a very special role; they are the zonal func- 
tions. 

We shall restrict our study to complex valued polynomial functions of the
elements of a positive definite real symmetric $k \times k$ matrix $A$. When we come to the
evaluation of the integral (8), we can expand the exponential in the integrand
in a power series

$$\sum_{f=0}^{\infty} \frac{1}{f!} \left(\mathrm{tr}(BHAH')\right)^f \tag{16}$$

whose terms will be polynomials in the elements of $A$ and $B$, and our results will
be applicable to these.
Corresponding to a congruence transformation

$$A \to LAL', \qquad L \in G(k), \tag{17}$$

of the space of positive definite real symmetric matrices $A$, one can define an
induced linear transformation of the polynomials $\varphi(A)$ in the elements of $A$ with
real or complex coefficients,

$$\varphi(A) \to (L\varphi)(A) = \varphi(L^{-1}AL'^{-1}), \qquad L \in G(k). \tag{18}$$

(18) is a representation of $G(k)$ in the vector space of all polynomials $\varphi(A)$.

The first problem is to decompose the space of polynomials into its irreducible
invariant subspaces and find out which irreducible representations of $G(k)$ are
present. Since the transformation (18) maps any monomial into a homogeneous
polynomial of the same degree, the vector space $V_f$ of homogeneous polynomials
of degree $f$ is an invariant subspace of the space of all polynomials. Let us
concentrate upon this.

From the results of Littlewood [9] and Foulkes [4] it can be shown that the
space $V_f$ decomposes into the direct sum of $m$ irreducible invariant subspaces
$V_{f,p}$, where $m$ is the number of elements in the set $P(f, k)$ of partitions
$p = (f_1, f_2, \ldots)$ of $f$ into not more than $k$ parts, $f = f_1 + f_2 + \cdots$:

$$V_f = \bigoplus_{p \in P(f,k)} V_{f,p}. \tag{19}$$

In each of the $V_{f,p}$ a separate irreducible representation, namely $\{2f_1, 2f_2, \ldots\}$,
of $G(k)$ acts. The symbol $\{2f_1, 2f_2, \ldots\}$ denotes the irreducible representation
of $G(k)$ corresponding to the Young symmetry diagram whose rows are of length
$2f_1, 2f_2, \ldots$ respectively. See Boerner [2].

Now consider the vector space $V_{f,p}$ for any $p$, under the transformations (18)
but with the matrices $L$ restricted to be orthogonal,

$$\varphi(A) \to \varphi(H^{-1}AH'^{-1}), \qquad H \in O(k). \tag{20}$$

Since $O(k)$ is only a subgroup of $G(k)$, $V_{f,p}$ will not be irreducible under it in
general, but will decompose into a direct sum of irreducible invariant subspaces
$V_{f,p,i}$,

$$V_{f,p} = V_{f,p,1} \oplus V_{f,p,2} \oplus \cdots \tag{21}$$

of which there will be a unique one, say $V_{f,p,1}$, which is one-dimensional and is
generated by a polynomial $Z_p(A)$ which is invariant under orthogonal
transformations (20). $Z_p(A)$ is called a zonal polynomial. Being invariant under (20),
it must be a symmetric function of the latent roots $t_1, \ldots, t_k$ of $A$. Zonal
polynomials of low order are given in the appendix.


4. Evaluation of the integral. 
THEOREM 2. If $A$ and $B$ are symmetric $k \times k$ matrices, $O(k)$ the group of
orthogonal $k \times k$ matrices and $d(H)$ its invariant measure, then

$$\int_{O(k)} \mathrm{etr}(BHAH')\, d(H) = \sum_{f=0}^{\infty} \frac{1}{f!\; 1\cdot 3\cdot 5 \cdots (2f-1)} \sum_{p \in P(f,k)} \frac{c(p)}{Z_p(I)}\, Z_p(B)\, Z_p(A), \tag{22}$$

where $P(f, k)$ is the set of partitions $p = (f_1, f_2, \ldots)$ of $f$ into not more than $k$
parts and $Z_p(A)$ is the zonal polynomial corresponding to the partition $p$ and hence
to the representation $\{2f_1, 2f_2, \ldots\}$ of the linear group $G(k)$.

$Z_p(A)$ is a symmetric polynomial in the latent roots $t_1, \ldots, t_k$ of $A$, which
can thus be written as a polynomial in the sums of powers $s_i = \sum_{j=1}^{k} t_j^i = \mathrm{tr}(A^i)$
of these roots. $c(p)$ is a constant and $Z_p(I)$, the value of $Z_p(A)$ at $A = I_k$, is a
polynomial of degree $f$ in $k$.

PROOF: Consider the function $(\mathrm{tr}(BA))^f$. Since it is a homogeneous
polynomial of degree $f$ in both $B$ and $A$ it belongs to the direct product $V_f \times V_f$ of
$V_f$ with itself. Assume a basis has been chosen in each $V_{f,p,i}$ for all $p$ and $i$. Then
any element of $V_f \times V_f$ is a unique linear combination of terms each of which
is the product of a basis function of $B$ with a basis function of $A$. In particular

$$(\mathrm{tr}(BA))^f = \sum_{p \in P(f,k)} \sum_{q \in P(f,k)} c_{pq}\, Z_p(B)\, Z_q(A) + \text{other terms}, \tag{23}$$

where each of the other terms has at least one of its two factors in a $V_{f,p,i}$ with
$i > 1$, i.e. not a zonal function.

Let $A$ and $B$ be transformed by an arbitrary $L \in G(k)$, $A$ cogrediently and $B$
contragrediently:

$$A \to LAL', \qquad B \to L'^{-1}BL^{-1}, \qquad L \in G(k). \tag{24}$$







Then $\mathrm{tr}(BA) \to \mathrm{tr}(L'^{-1}BL^{-1}LAL') = \mathrm{tr}(L'^{-1}BAL') = \mathrm{tr}(BA)$, i.e. $\mathrm{tr}(BA)$,
and hence $(\mathrm{tr}(BA))^f$, are invariant.

In a direct product of an irreducible covariant space $V_{f,p}$ with an irreducible
contravariant space $V_{f,q}$ there will be an invariant if and only if the
representation in $V_{f,q}$ is the contragredient representation corresponding to the
cogredient representation in $V_{f,p}$, i.e. if and only if $p = q$. Hence the expansion
(23) of $(\mathrm{tr}(BA))^f$ contains only those terms whose factors belong to subspaces
$V_{f,p}$ with the same index $p$. Thus $c_{pq} = 0$ if $p \ne q$, and both factors of the "other
terms" must belong to subspaces with the same $p$, i.e. we cannot have terms
belonging to $V_{f,p,i} \times V_{f,q,j}$ with $p \ne q$.

As $(\mathrm{tr}(BA))^f$ is invariant under orthogonal transformations (20), we must
likewise have $i = j$ for any nonzero term. In summary, all terms belong to
subspaces of the form $V_{f,p,i} \times V_{f,p,i}$, and if one factor of a term is a zonal function,
so is the other.

Averaging over $O(k)$ annihilates all irreducible invariant subspaces other than
those in which the identity representation acts. These it leaves unaltered. Thus
the vectors in the $V_{f,p,i}$ for $i > 1$ are all mapped on zero but the zonal
polynomials in the $V_{f,p,1}$ remain unchanged. Therefore all the "other terms" in the
expansion (23) disappear under the averaging process and we have

$$\int_{O(k)} \left(\mathrm{tr}(BHAH')\right)^f d(H) = \sum_{p \in P(f,k)} c_{pp}\, Z_p(B)\, Z_p(A). \tag{25}$$

When calculating zonal functions, one can find coefficients $c(p)$ such that

$$(\mathrm{tr}\, A)^f = \frac{1}{1\cdot 3\cdot 5 \cdots (2f-1)} \sum_{p \in P(f,k)} c(p)\, Z_p(A). \tag{26}$$

Substituting $B = I_k$ in (25) we have $(\mathrm{tr}\, A)^f = \sum_p c_{pp}\, Z_p(I)\, Z_p(A)$. Therefore

$$c_{pp} = \frac{c(p)}{1\cdot 3\cdot 5 \cdots (2f-1)\, Z_p(I)}$$

and

$$\int_{O(k)} \left(\mathrm{tr}(BHAH')\right)^f d(H) = \frac{1}{1\cdot 3\cdot 5 \cdots (2f-1)} \sum_{p \in P(f,k)} \frac{c(p)}{Z_p(I)}\, Z_p(B)\, Z_p(A),$$

from which the theorem follows.
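
The expansion of Theorem 2 can be checked numerically in the smallest nontrivial case $k = 2$, where averaging over $O(2)$ reduces to a one-dimensional integral over rotation angles (reflections contribute the same average). The sketch below assumes NumPy; the names `ZONAL`, `series_22` and `quadrature_k2` are hypothetical. The polynomials are those of the appendix in terms of the power sums $s_i$, and the constants $c(p)$ used are the values that satisfy identity (26), which is an assumption wherever the printed table is illegible. Partitions with more than $k$ parts are skipped, since they lie outside $P(f, k)$ and have $Z_p(I) = 0$.

```python
import math
import numpy as np

# partition p -> (Z_p as a function of power sums s[1..4], c(p))
ZONAL = {
    (1,): (lambda s: s[1], 1),
    (2,): (lambda s: s[1]**2 + 2*s[2], 1),
    (1, 1): (lambda s: s[1]**2 - s[2], 2),
    (3,): (lambda s: s[1]**3 + 6*s[1]*s[2] + 8*s[3], 1),
    (2, 1): (lambda s: s[1]**3 + s[1]*s[2] - 2*s[3], 9),
    (1, 1, 1): (lambda s: s[1]**3 - 3*s[1]*s[2] + 2*s[3], 5),
    (4,): (lambda s: s[1]**4 + 12*s[1]**2*s[2] + 12*s[2]**2
           + 32*s[1]*s[3] + 48*s[4], 1),
    (3, 1): (lambda s: s[1]**4 + 5*s[1]**2*s[2] - 2*s[2]**2
             + 4*s[1]*s[3] - 8*s[4], 20),
    (2, 2): (lambda s: s[1]**4 + 2*s[1]**2*s[2] + 7*s[2]**2
             - 8*s[1]*s[3] - 2*s[4], 14),
    (2, 1, 1): (lambda s: s[1]**4 - s[1]**2*s[2] - 2*s[2]**2
                - 2*s[1]*s[3] + 4*s[4], 56),
    (1, 1, 1, 1): (lambda s: s[1]**4 - 6*s[1]**2*s[2] + 3*s[2]**2
                   + 8*s[1]*s[3] - 6*s[4], 14),
}

def power_sums(M):
    """Return [0, s1, s2, s3, s4] with s_i = tr(M^i), via the latent roots of M."""
    eig = np.linalg.eigvalsh(M)
    return [0.0] + [float(np.sum(eig**i)) for i in range(1, 5)]

def series_22(B, A, max_degree=4):
    """Partial sum of the zonal series for the O(k) average of etr(B H A H')."""
    k = A.shape[0]
    sA, sB = power_sums(A), power_sums(B)
    sI = [0.0] + [float(k)] * 4          # every power sum of I_k equals k
    total = 1.0                          # the f = 0 term
    for p, (Z, c) in ZONAL.items():
        f = sum(p)
        if f > max_degree or len(p) > k:
            continue                     # outside the truncation or outside P(f, k)
        odd = math.prod(range(1, 2*f, 2))  # 1*3*5*...*(2f-1)
        total += c * Z(sB) * Z(sA) / (math.factorial(f) * odd * Z(sI))
    return total

def quadrature_k2(B, A, n_grid=2000):
    """Average of etr(B H A H') over rotations H of the plane (k = 2 only)."""
    vals = []
    for t in np.linspace(0.0, 2*np.pi, n_grid, endpoint=False):
        H = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
        vals.append(np.exp(np.trace(B @ H @ A @ H.T)))
    return float(np.mean(vals))
```

For small matrices such as `A = np.diag([0.6, 0.2])` and `B = np.diag([0.5, -0.3])`, the two functions should agree closely, with the discrepancy governed by the omitted terms of degree five and higher.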


5. The distribution of the roots. If $l_1, \ldots, l_k$ are the latent roots of the
covariance matrix $n^{-1}A$, then $t_i = n l_i$.
The convergence of the series can probably be improved by writing

$$\mathrm{etr}\left(-\tfrac{1}{2}\Sigma^{-1}HAH'\right) = \mathrm{etr}\left(-\frac{1}{2\sigma^2}A\right) \mathrm{etr}(BHAH'), \tag{27}$$

where $B = (1/2\sigma^2)I - \tfrac{1}{2}\Sigma^{-1}$, and the constant $\sigma^2$ is chosen to give optimum
convergence.







THEOREM 3. The distribution of the latent roots $l_1, \ldots, l_k$ of the covariance
matrix calculated from a sample of $N = n + 1$ observations from a normal $k$-variate
population with covariance matrix $\Sigma$ is

$$\left(\frac{n}{2\pi}\right)^{kn/2} c \left(\prod_{i=1}^{k} \sigma_i\right)^{-n} \exp\left(-\frac{n}{2\sigma^2} \sum_{i=1}^{k} l_i\right) \left(\prod_{i=1}^{k} l_i\right)^{(n-k-1)/2} \prod_{i<j} (l_i - l_j)$$
$$\cdot \sum_{f=0}^{\infty} \frac{n^f}{f!\; 1\cdot 3\cdot 5 \cdots (2f-1)} \sum_{p \in P(f,k)} \frac{c(p)}{Z_p(I)}\, Z_p(\beta_1, \ldots, \beta_k)\, Z_p(l_1, \ldots, l_k)\; dl_1 \cdots dl_k, \tag{28}$$

where $Z_p(l_1, \ldots, l_k)$ are zonal polynomials,

$$\beta_i = \frac{1}{2}\left(\frac{1}{\sigma^2} - \frac{1}{\sigma_i^2}\right), \qquad i = 1, \ldots, k,$$

$\sigma^2$ is an arbitrary constant chosen to optimize the convergence of the series, $\sigma_i^2$,
$i = 1, \ldots, k$, are the latent roots of the population covariance matrix and the constant
$c$ is given in (3).

The zonal polynomials $Z_p(l_1, \ldots, l_k)$ are listed up to $f = 4$ in the appendix
as functions of $s_i = \sum_{j=1}^{k} l_j^i$, together with the constants $c(p)$ and $Z_p(I)$. The
symbol $P(f, k)$ is explained in Theorem 2.

Roy [11] discusses significance tests and confidence intervals based on the
roots distribution.


APPENDIX 


Zonal polynomials of the representation of the linear group in the space of
polynomials in the elements of a positive definite real symmetric matrix. ($s_i$ is
the sum of the $i$th powers of the latent roots of the matrix.)


Degree f | Partition p | Z_p                                                    | c(p) | Z_p(I)
---------|-------------|--------------------------------------------------------|------|------------------------
1        | (1)         | $s_1$                                                  | 1    | $k$
2        | (2)         | $s_1^2 + 2s_2$                                         | 1    | $k(k+2)$
2        | (1,1)       | $s_1^2 - s_2$                                          | 2    | $k(k-1)$
3        | (3)         | $s_1^3 + 6s_1s_2 + 8s_3$                               | 1    | $k(k+2)(k+4)$
3        | (2,1)       | $s_1^3 + s_1s_2 - 2s_3$                                | 9    | $k(k+2)(k-1)$
3        | (1,1,1)     | $s_1^3 - 3s_1s_2 + 2s_3$                               | 5    | $k(k-1)(k-2)$
4        | (4)         | $s_1^4 + 12s_1^2s_2 + 12s_2^2 + 32s_1s_3 + 48s_4$      | 1    | $k(k+2)(k+4)(k+6)$
4        | (3,1)       | $s_1^4 + 5s_1^2s_2 - 2s_2^2 + 4s_1s_3 - 8s_4$          | 20   | $k(k+2)(k+4)(k-1)$
4        | (2,2)       | $s_1^4 + 2s_1^2s_2 + 7s_2^2 - 8s_1s_3 - 2s_4$          | 14   | $k(k+2)(k-1)(k+1)$
4        | (2,1,1)     | $s_1^4 - s_1^2s_2 - 2s_2^2 - 2s_1s_3 + 4s_4$           | 56   | $k(k+2)(k-1)(k-2)$
4        | (1,1,1,1)   | $s_1^4 - 6s_1^2s_2 + 3s_2^2 + 8s_1s_3 - 6s_4$          | 14   | $k(k-1)(k-2)(k-3)$
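
The tabulated polynomials and constants can be checked against identity (26), which requires $\sum_p c(p) Z_p = 1\cdot 3\cdots(2f-1)\, s_1^f$ coefficient by coefficient. The sketch below uses plain Python; the dictionary `TABLE` and the two helper names are hypothetical, and the $c(p)$ entries are taken to be the values consistent with (26). Monomials in the power sums are encoded as tuples of indices, so `(1, 1, 2)` stands for $s_1^2 s_2$.

```python
from math import prod

# degree f -> {partition p: (Z_p as {monomial: coefficient}, c(p))}
TABLE = {
    1: {(1,): ({(1,): 1}, 1)},
    2: {(2,): ({(1, 1): 1, (2,): 2}, 1),
        (1, 1): ({(1, 1): 1, (2,): -1}, 2)},
    3: {(3,): ({(1, 1, 1): 1, (1, 2): 6, (3,): 8}, 1),
        (2, 1): ({(1, 1, 1): 1, (1, 2): 1, (3,): -2}, 9),
        (1, 1, 1): ({(1, 1, 1): 1, (1, 2): -3, (3,): 2}, 5)},
    4: {(4,): ({(1, 1, 1, 1): 1, (1, 1, 2): 12, (2, 2): 12,
                (1, 3): 32, (4,): 48}, 1),
        (3, 1): ({(1, 1, 1, 1): 1, (1, 1, 2): 5, (2, 2): -2,
                  (1, 3): 4, (4,): -8}, 20),
        (2, 2): ({(1, 1, 1, 1): 1, (1, 1, 2): 2, (2, 2): 7,
                  (1, 3): -8, (4,): -2}, 14),
        (2, 1, 1): ({(1, 1, 1, 1): 1, (1, 1, 2): -1, (2, 2): -2,
                     (1, 3): -2, (4,): 4}, 56),
        (1, 1, 1, 1): ({(1, 1, 1, 1): 1, (1, 1, 2): -6, (2, 2): 3,
                        (1, 3): 8, (4,): -6}, 14)},
}

def weighted_sum(f):
    """Coefficients of sum_p c(p) Z_p for degree f, as {monomial: coefficient}."""
    combined = {}
    for poly, c in TABLE[f].values():
        for mono, coef in poly.items():
            combined[mono] = combined.get(mono, 0) + c * coef
    return combined

def value_at_identity(poly, k):
    """Z_p(I_k): every power sum of I_k equals k, so a monomial with r factors gives k**r."""
    return sum(coef * k**len(mono) for mono, coef in poly.items())
```

For each degree, `weighted_sum(f)` should have the single nonzero coefficient $1\cdot 3\cdots(2f-1)$ on $s_1^f$, and `value_at_identity` should reproduce the factored forms in the last column of the table.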







REFERENCES 

[1] F. A. Berezin and I. M. Gelfand, "Some remarks on the theory of spherical functions
on symmetric Riemannian manifolds," Trudy Moskov. Mat. Obšč., Vol. 5 (1956),
pp. 311-351. Amer. Math. Soc. translation to appear shortly.

[2] H. Boerner, Darstellungen von Gruppen, Springer, Berlin (1955).

[3] R. A. Fisher, "The sampling distribution of some statistics obtained from nonlinear
equations," Ann. Eugenics, Vol. 9 (1939), pp. 238-249.

[4] O. H. Foulkes, "Plethysms of S-functions," Philos. Trans. Roy. Soc. London. Ser. A,
Vol. 246 (1954), pp. 555-591.

[5] R. Godement, "A theory of spherical functions I," Trans. Amer. Math. Soc., Vol. 73
(1952), pp. 496-556.

[6] P. L. Hsu, "On the distribution of the roots of certain determinantal equations,"
Ann. Eugenics, Vol. 9 (1939), pp. 250-258.

[7] A. T. James, "The noncentral Wishart distribution," Proc. Roy. Soc. London. Ser.
A, Vol. 229 (1955), pp. 364-366.

[8] A. T. James, "A generating function for averages over the orthogonal group," Proc.
Roy. Soc. London. Ser. A, Vol. 229 (1955), pp. 367-375.

[9] Dudley E. Littlewood, The Theory of Group Characters and Matrix Representations of
Groups, second edition, Oxford, 1950.

[10] S. N. Roy, "p-statistics or some generalizations in the analysis of variance appropriate
to multivariate problems," Sankhya, Vol. 4 (1939), pp. 381-396.

[11] S. N. Roy, Some Aspects of Multivariate Analysis, John Wiley and Sons, New York,
1958.





TWO-SAMPLE TESTS FOR MULTIVARIATE DISTRIBUTIONS¹
By Lionel Weiss
Cornell University

1. Introduction and summary. $X(1), X(2), \ldots, X(m), Y(1), Y(2), \ldots, Y(n)$
are independent $k$-variate random variables. The distribution of $X(i)$ has pdf
$f(x)$, say, where $x$ denotes a $k$-dimensional vector throughout this paper, and the
distribution of $Y(j)$ has pdf $g(x)$, say. We assume that $f(x)$ and $g(x)$ are piecewise
continuous, and that each has a finite upper bound, which it is not necessary to
specify.

Denote by $2R_i$ the distance from $X(i)$ to the nearest of the points $X(1), \ldots,
X(i-1), X(i+1), \ldots, X(m)$, and denote by $S_i$ the number of points $Y(1),
\ldots, Y(n)$ contained in the open sphere $\{x : |x - X(i)| < R_i\}$. Clearly, the
joint distribution of $S_i, S_j$ is the same as the joint distribution of $S_{i'}, S_{j'}$ for
any subscripts with $i \ne j$, $i' \ne j'$. Let $r$ be a non-negative integer, and $\alpha$ any
fixed positive value. $Q(r)$ denotes the Lebesgue integral

$$\int_{E_k} \frac{2^k \alpha f^2(x)\,[g(x)]^r}{[g(x) + 2^k \alpha f(x)]^{r+1}}\, dx,$$

where $E_k$ denotes Euclidean $k$-space. We will show that

$$\lim_{m \to \infty,\, n \to \infty,\, m/n = \alpha} P_{m,n}[S_1 = s_1, S_2 = s_2] = Q(s_1)Q(s_2),$$

for any non-negative integers $s_1, s_2$, the approach being uniform in $s_1, s_2$. Thus,
in the limit $S_1, S_2$ are independently distributed, with

$$\lim_{m \to \infty,\, n \to \infty,\, m/n = \alpha} P_{m,n}[S_1 = s_1] = Q(s_1).$$


In [1], which discussed the univariate case, $S_i$ was defined as the number of
$Y$'s closer to $X(i)$ than to any other $X$ to their right. In the present paper, $S_i$ is
defined as the number of $Y$'s in another neighborhood of $X(i)$. Our present
definition of $S_i$ does not become for $k = 1$ the same as the definition of $S_i$ in
[1]. Rather, in the univariate case, our present definition of $S_i$ is the number
of $Y$'s lying within a distance $R_i$ on either side of $X(i)$. However, if

$$\lim_{m \to \infty,\, n \to \infty,\, m/n = \alpha} P_{m,n}[S_1 = s_1, S_2 = s_2]$$

is computed for the univariate case using the definition of $S_i$ given in [1], the
only way in which it differs from $Q(s_1)Q(s_2)$ is that $\alpha$ is replaced by $\alpha/2$. Thus
it seems reasonable to treat the $S_i$ as defined here as $k$-dimensional analogues
of the $S_i$ as defined in [1], at least for large samples. An intuitive reason for $\alpha$
being replaced by $\alpha/2$ is that in our present case, $\sum_{i=1}^{m} S_i$ may be less than $n$,
whereas in [1] this sum must always equal $n$. Thus in our present case, we are


Received September 29, 1958; revised August 17, 1959. 
¹ Research sponsored by the Office of Naval Research.




in a sense discarding some of the $Y$'s, which lowers $n$ relative to $m$ and thus
raises $\alpha$ by a certain factor (2, as it happens). In our present case, $\sum S_i$ may
be less than $n$ because the $R_i$ are chosen to make the spheres around the $X$'s
non-overlapping, thus simplifying the analysis. The $R_i$ were chosen to give the
largest possible non-overlapping spheres because it would seem intuitively that
the larger the spheres, the more rapid the approach of the probabilities to their
limiting values.

2. Derivation of the limiting distribution of $S_1, S_2$. Let $p_m(r_1, r_2 \mid x(1), x(2))$
denote the joint conditional pdf of $R_1, R_2$ given that $X(1) = x(1)$ and
$X(2) = x(2)$. Denote $\int_{|x - a| < b} f(x)\, dx$ by $V(a, b; f)$, where $a \in E_k$ and $b$ is a
positive scalar. If $f(x)$ is continuous in an open region containing the open
sphere $\{x : |x - a| < b\}$, then $dV(a, b; f)/db$ is equal to the surface integral
$\int_{\{x : |x - a| = b\}} f(x)\, dS$, which we denote by $S(a, b; f)$.

$$P(R_1 \ge r_1 \text{ and } R_2 \ge r_2 \mid X(1) = x(1) \text{ and } X(2) = x(2)) = [1 - V(x(1), 2r_1; f) - V(x(2), 2r_2; f)]^{m-2}$$

if $r_1 \ge 0$, $r_2 \ge 0$, $|x(1) - x(2)| \ge 2\max(r_1, r_2)$; and is equal to zero for
other values of $r_1, r_2, x(1), x(2)$. If $f(x)$ is continuous in an open region
containing the points $x$ with $|x - x(i)| \le 2r_i$ for $i = 1, 2$, then


$$p_m(r_1, r_2 \mid x(1), x(2)) = \frac{\partial^2}{\partial r_1\, \partial r_2} P[R_1 \ge r_1, R_2 \ge r_2 \mid X(1) = x(1) \text{ and } X(2) = x(2)]$$
$$= (m-2)(m-3)\,[1 - V(x(1), 2r_1; f) - V(x(2), 2r_2; f)]^{m-4} \prod_{i=1}^{2} 2S(x(i), 2r_i; f)$$

if $r_1 \ge 0$, $r_2 \ge 0$, $|x(1) - x(2)| \ge 2\max(r_1, r_2)$; and is equal to zero for
other values of $r_1, r_2, x(1), x(2)$. From our continuity assumption on $f(x)$ we
have

$$V(x(i), 2r_i; f) = f(x(i))\left[\pi^{k/2}(2r_i)^k / \Gamma(\tfrac{1}{2}k + 1)\right] + r_i^k\, \epsilon_1(x(i); 2r_i),$$

and

$$S(x(i), 2r_i; f) = f(x(i))\, k\, \pi^{k/2}(2r_i)^{k-1} / \Gamma(\tfrac{1}{2}k + 1) + r_i^{k-1}\, \epsilon_2(x(i); 2r_i),$$

where $\epsilon_1(\;)$ and $\epsilon_2(\;)$ approach zero as $r_i$ approaches zero. Furthermore, these
quantities approach zero uniformly over any set $G$ of points in $(x(1), x(2))$
space such that $f(x)$ is uniformly continuous over the projection of $G$ on the
$x(1)$ hyperplane and on the $x(2)$ hyperplane and $|x(1) - x(2)| > \delta > 0$
over $G$.

Now introduce the random variables $Z_1, Z_2$ by the relationship $R_i = (Z_i/m)^{1/k}$,
for $i = 1, 2$. Denote by $h_m(z_1, z_2 \mid x(1), x(2))$ the joint conditional pdf of $Z_1$,
$Z_2$ given that $X(1) = x(1)$ and $X(2) = x(2)$. By substituting in
$p_m(r_1, r_2 \mid x(1), x(2))$ and using the facts developed above, we have

$$\lim_{m \to \infty} h_m(z_1, z_2 \mid x(1), x(2)) = \prod_{i=1}^{2} \frac{(2\sqrt{\pi})^k}{\Gamma(\tfrac{1}{2}k + 1)}\, f(x(i)) \exp\left\{-\frac{(2\sqrt{\pi})^k}{\Gamma(\tfrac{1}{2}k + 1)}\, z_i\, f(x(i))\right\}$$

uniformly over any set $G$ in $(x(1), x(2), z_1, z_2)$ space such that $f(x)$ is
uniformly continuous over the projection of $G$ on the $x(1)$ hyperplane and on the
$x(2)$ hyperplane, $|x(1) - x(2)| > \delta > 0$ over $G$, and the projections of $G$ on
the $z_1$ and $z_2$ axes are bounded from above. We need consider only positive $z_1, z_2$.

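Next, denote by $D_{m,n}(s_1, s_2 \mid z_1, z_2, x(1), x(2))$ the conditional probability that
$S_1 = s_1$ and $S_2 = s_2$, given that $Z_1 = z_1$, $Z_2 = z_2$, $X(1) = x(1)$, $X(2) = x(2)$.
Then

$$D_{m,n}(s_1, s_2 \mid z_1, z_2, x(1), x(2)) = \frac{n!}{s_1!\, s_2!\, (n - s_1 - s_2)!}\, \left(1 - V(x(1), r_1; g) - V(x(2), r_2; g)\right)^{n - s_1 - s_2} \prod_{i=1}^{2} V^{s_i}(x(i), r_i; g)$$

if $|x(1) - x(2)| > 2\max(r_1, r_2)$. It is easily verified that

$$\lim_{m \to \infty,\, n \to \infty,\, m/n = \alpha} D_{m,n}(s_1, s_2 \mid z_1, z_2, x(1), x(2)) = \prod_{i=1}^{2} \frac{1}{s_i!} \left[\frac{\pi^{k/2} z_i\, g(x(i))}{\alpha\, \Gamma(\tfrac{1}{2}k + 1)}\right]^{s_i} \exp\left\{-\frac{\pi^{k/2} z_i\, g(x(i))}{\alpha\, \Gamma(\tfrac{1}{2}k + 1)}\right\}$$

uniformly over any set $G_1$ of points in $(x(1), x(2), z_1, z_2)$ space such that $g(x)$
is uniformly continuous over the projection of $G_1$ on the $x(1)$ hyperplane, and
over the projection of $G_1$ on the $x(2)$ hyperplane, and the projections of $G_1$ on
the $z_1$ and $z_2$ axes are bounded from above.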

Given any positive $\epsilon$, we can find a subset $K(\epsilon)$ of $(x(1), x(2))$ space such
that $P(X(1), X(2) \text{ in } K(\epsilon)) \ge 1 - \epsilon$, $f(x)$ is uniformly continuous on the
projection of $K(\epsilon)$ on the $x(i)$ hyperplane ($i = 1, 2$), and $|x(1) - x(2)| > \delta(\epsilon) > 0$
at each point of $K(\epsilon)$.

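Since

$$P_{m,n}[S_1 = s_1, S_2 = s_2] = \int_{E_k} \int_{E_k} \left[\int_0^\infty \int_0^\infty D_{m,n}(s_1, s_2 \mid z_1, z_2, x(1), x(2))\, h_m(z_1, z_2 \mid x(1), x(2))\, dz_1\, dz_2\right] f(x(1))\, f(x(2))\, dx(1)\, dx(2),$$

we have

$$\left| P_{m,n}(S_1 = s_1, S_2 = s_2) - \iint_{K(\epsilon)} \left[\int_0^\infty \int_0^\infty D_{m,n}(\;)\, h_m(\;)\, dz_1\, dz_2\right] f(x(1))\, f(x(2))\, dx(1)\, dx(2) \right| \le \epsilon$$

for all values of $m$ and $n$. From our discussion above, it can be seen that

$$\lim_{m \to \infty,\, n \to \infty,\, m/n = \alpha} \int_0^\infty \int_0^\infty D_{m,n}(\;)\, h_m(\;)\, dz_1\, dz_2 = \int_0^\infty \int_0^\infty \lim D_{m,n}(\;)\, h_m(\;)\, dz_1\, dz_2 = \prod_{i=1}^{2} \frac{2^k \alpha f(x(i))\, [g(x(i))]^{s_i}}{[g(x(i)) + 2^k \alpha f(x(i))]^{s_i + 1}}$$

uniformly over $K(\epsilon)$. This means

$$\lim_{m \to \infty,\, n \to \infty,\, m/n = \alpha} \iint_{K(\epsilon)} \left[\int_0^\infty \int_0^\infty D_{m,n}(\;)\, h_m(\;)\, dz_1\, dz_2\right] f(x(1))\, f(x(2))\, dx(1)\, dx(2) = \iint_{K(\epsilon)} \prod_{i=1}^{2} \frac{2^k \alpha f^2(x(i))\, [g(x(i))]^{s_i}}{[g(x(i)) + 2^k \alpha f(x(i))]^{s_i + 1}}\, dx(1)\, dx(2).$$

This last expression differs from $Q(s_1)Q(s_2)$ by less than $\epsilon$, and this completes
the demonstration, since $\epsilon$ can be taken arbitrarily close to zero. The uniformity
of approach follows from the equalities

$$\sum_{s_1} \sum_{s_2} P_{m,n}(S_1 = s_1, S_2 = s_2) = \sum_{s_1} \sum_{s_2} Q(s_1)Q(s_2) = \int_{E_k} \int_{E_k} f(x(1))\, f(x(2))\, dx(1)\, dx(2),$$

the summations extending over all non-negative integers.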


3. Applications. For each non-negative integer $r$, let $Q_m(r)$ denote the
proportion of the values $S_1, \ldots, S_m$ which are equal to $r$. We will show that $Q_m(r)$
converges stochastically to $Q(r)$ as $m$, $n$ increase with $m/n = \alpha$. Define the
random variable $U_i$ to be equal to one if $S_i$ is equal to $r$, and to be equal to zero
if $S_i$ is not equal to $r$. Then $Q_m(r) = (U_1 + \cdots + U_m)/m$, and

$$E\{Q_m(r)\} = E\{U_1\} = P(S_1 = r),$$

Variance $\{Q_m(r)\} = (1/m)\,\mathrm{Var}\{U_1\} + ((m-1)/m)\,\mathrm{Cov}\{U_1, U_2\}$. But
Variance $\{U_1\}$ is equal to $P(S_1 = r)[1 - P(S_1 = r)]$, and Cov $\{U_1, U_2\}$ is equal
to $P(S_1 = r \text{ and } S_2 = r) - P(S_1 = r)P(S_2 = r)$. Since we showed above
that $S_1$ and $S_2$ are asymptotically independent, it follows that Variance $\{Q_m(r)\}$
approaches zero. It has also been shown that $P(S_1 = r)$ approaches $Q(r)$, so
it follows from Chebyshev's inequality that $Q_m(r)$ converges stochastically to
$Q(r)$.
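
The convergence of $Q_m(r)$ can be illustrated by simulation. The sketch below assumes NumPy and uses hypothetical function names; it computes the counts $S_i$ by brute force from the definitions in Section 1 and gives the limiting value $Q(r)$ in the special case $f = g$ almost everywhere, where the defining integral reduces to $2^k\alpha / (1 + 2^k\alpha)^{r+1}$.

```python
import numpy as np

def sphere_counts(X, Y):
    """S_i = number of Y's in the open sphere of radius R_i about X(i),
    where 2*R_i is the distance from X(i) to its nearest neighbour among the X's."""
    dXX = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dXX, np.inf)          # a point is not its own neighbour
    R = dXX.min(axis=1) / 2.0
    dXY = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    return (dXY < R[:, None]).sum(axis=1)

def q_limit_equal_densities(r, k, alpha):
    """Q(r) when f = g almost everywhere: c / (1 + c)**(r + 1) with c = 2**k * alpha."""
    c = 2.0**k * alpha
    return c / (1.0 + c)**(r + 1)

def q_m(S, r):
    """Proportion of the S_i equal to r."""
    return float(np.mean(S == r))
```

For example, with $m = n$ points drawn from the same bivariate normal distribution ($k = 2$, $\alpha = 1$), `q_m(S, 0)` should settle near `q_limit_equal_densities(0, 2, 1.0)`, which is 0.8. Because the spheres do not overlap, the counts always satisfy $\sum S_i \le n$.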

Take the case $r = 0$. In evaluating $Q(0)$, $[g(x)]^0$ is taken to be unity even if
$g(x) = 0$. Then

$$\int_{E_k} \left[\frac{\sqrt{2^k \alpha}\, f(x)}{\sqrt{g(x) + 2^k \alpha f(x)}} - \frac{\sqrt{2^k \alpha}\, \sqrt{g(x) + 2^k \alpha f(x)}}{1 + 2^k \alpha}\right]^2 dx \ge 0,$$

with equality holding if and only if $f(x) = g(x)$ almost everywhere. Expanding
the integrand and integrating, we find that $Q(0) \ge (2^k \alpha)/(1 + 2^k \alpha)$, with
equality holding if and only if $f(x) = g(x)$ almost everywhere.
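
The expansion can be written out as follows. This is a worked sketch under the assumption that the integrand is the square of $\sqrt{c}\,f/\sqrt{g + cf} - \sqrt{c}\,\sqrt{g + cf}/(1 + c)$, with the shorthand $c = 2^k\alpha$ introduced here, and using $\int f = \int g = 1$.

```latex
% Shorthand (not in the original): c = 2^k \alpha.
\begin{align*}
0 &\le \int_{E_k} \left[ \frac{\sqrt{c}\, f}{\sqrt{g + c f}}
        - \frac{\sqrt{c}\,\sqrt{g + c f}}{1 + c} \right]^2 dx \\
  &= \int_{E_k} \frac{c f^2}{g + c f}\, dx
     - \frac{2c}{1 + c} \int_{E_k} f \, dx
     + \frac{c}{(1 + c)^2} \int_{E_k} (g + c f)\, dx \\
  &= Q(0) - \frac{2c}{1 + c} + \frac{c\,(1 + c)}{(1 + c)^2} \\
  &= Q(0) - \frac{c}{1 + c}.
\end{align*}
```

Equality requires the bracket to vanish almost everywhere, that is $(1 + c) f = g + c f$, which is $f = g$.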

If it were desired to test the hypothesis that $f(x) = g(x)$ almost everywhere,
a reasonable test procedure would seem to be to reject if $Q_m(0)$ is "too far"
above $(2^k \alpha)/(1 + 2^k \alpha)$. In [1] it was shown that the Wald-Wolfowitz run test
[2] is equivalent to rejecting the hypothesis when $Q_m(0)$ is "too large", where
$Q_m(0)$ is defined using the $S_i$ of [1]. If we accept the analogy between the $S_i$
as defined in this paper and the $S_i$ as defined in [1], we see that the test which
rejects when $Q_m(0)$ is "too large" is a multivariate analogue of the
Wald-Wolfowitz run test.

However, there are certain difficulties in the way of using the test based on
$Q_m(0)$. Even if $f(x) = g(x)$ almost everywhere, so that the hypothesis is true,
the distribution of $Q_m(0)$ depends on the common density function $f(x)$. This
is seen from an examination of the expression for the variance of $Q_m(0)$. Thus
the test based on $Q_m(0)$ is not similar to the sample space, so the level of
significance must be defined as the least upper bound of probabilities of rejecting
the hypothesis when it is true. To be more precise, suppose we want the
probability of a type I error to be no greater than a preassigned value $\beta$ ($0 < \beta < 1$).
Fixing $m$, $n$, with $m/n = \alpha$, we can write our critical region as

$$Q_m(0) \ge (2^k \alpha)/(1 + 2^k \alpha) + \delta_m(\beta),$$

where $\delta_m(\beta)$ is chosen so that when $g(x) = f(x)$ almost everywhere,

$$\mathrm{l.u.b.}_{f}\; P[Q_m(0) \ge (2^k \alpha)/(1 + 2^k \alpha) + \delta_m(\beta)] \le \beta.$$

We can satisfy this inequality trivially by choosing $\delta_m(\beta)$ equal to
$1 - (2^k \alpha)/(1 + 2^k \alpha)$, but then our level of significance is zero, and the test is of no interest.
What we want, of course, is to have $\delta_m(\beta)$ small so that the power of the test
is good. One way of guaranteeing a reasonably small value of $\delta_m(\beta)$ is to limit
the class of density functions under consideration in some way. One such way
is to assume that all the possible density functions have the following property:
given any positive $\epsilon$, there is a positive $\gamma(\epsilon)$ so that the variation of the density
function over any sphere of $k$-dimensional volume $\gamma(\epsilon)$ is no greater than $\epsilon$, and
$\lim_{\epsilon \to 0} \gamma(\epsilon)/\epsilon = c > 0$. Then an examination of the argument of Section 2 will
show that when $g(x) = f(x)$, and $f(x)$ has the continuity property just
described,

$$|E\{Q_m(0)\} - (2^k \alpha)/(1 + 2^k \alpha)| \le A_1(c, m), \qquad \text{Variance}\,\{Q_m(0)\} \le A_2(c, m),$$

where $A_1(c, m)$ and $A_2(c, m)$ approach zero as $m$ increases with $m/n = \alpha$, for
any fixed positive $c$. Chebyshev's inequality gives

$$P[Q_m(0) \ge (2^k \alpha)/(1 + 2^k \alpha) + A_1(c, m) + t] \le (1/t^2)\, A_2(c, m),$$

and setting $t = [(1/\beta)\, A_2(c, m)]^{1/2}$ gives

$$P[Q_m(0) \ge (2^k \alpha)/(1 + 2^k \alpha) + A_1(c, m) + [(1/\beta)\, A_2(c, m)]^{1/2}] \le \beta.$$

Thus $\delta_m(\beta) \le A_1(c, m) + [(1/\beta)\, A_2(c, m)]^{1/2}$, and this upper bound for $\delta_m(\beta)$
approaches zero as $m$ increases with $m/n = \alpha$, so that if the hypothesis is not true,
the probability of rejection approaches one as $m$, $n$ increase. The actual
computation of the functions $A_1(c, m)$, $A_2(c, m)$, though possible, would be quite
involved, and will not be carried out here.

The test based on $Q_m(0)$ is, as has been noted, not similar to the sample space.
To the author's knowledge, no test of the hypothesis under discussion which
has reasonable power properties has been shown to be similar to the sample
space. The quantities $Q_m(r)$ are invariant under translations and rotations of
$k$-dimensional space, or under linear stretching of each of the $k$ axes by the same
factor. Intuitively, then, one would expect tests based on $Q_m(r)$ to be closer to
similarity than such tests as the chi-square test or the Kolmogorov-Smirnov test,
in the multivariate case.

Professor W. Kruskal has pointed out a lack of symmetry in the test based on
$Q_m(0)$ described above, in that interchanging the roles of $X$ and $Y$ gives a test
statistic that is not in one-one correspondence with $Q_m(0)$. Most other
two-sample tests that have been proposed do not exhibit this lack of symmetry. A
test which does not suffer from this lack of symmetry is one based on the average
of $Q_m(0)$ and the corresponding quantity given by interchanging the roles of $X$
and $Y$.


REFERENCES 


[1] J. R. Blum and Lionel Weiss, "Consistency of certain two-sample tests," Ann. Math.
Stat., Vol. 28 (1957), pp. 242-246.
[2] A. Wald and J. Wolfowitz, "On a test whether two samples are from the same
population," Ann. Math. Stat., Vol. 11 (1940), pp. 147-162.





A MODIFICATION OF THE SEQUENTIAL PROBABILITY RATIO TEST
TO REDUCE THE SAMPLE SIZE¹


By T. W. ANDERSON 
Columbia University and 
Center for Advanced Study in the Behavioral Sciences 

1. Summary and introduction. The sequential probability ratio test is con- 
structed as a sequential test of one simple hypothesis against another. In many 
instances a parametric form is assumed for the density or (discrete) probability 
function, and the two simple hypotheses are specified by two values of the 
parameter. The sequential probability ratio test has an optimum property for 
these two hypotheses, namely, given such a test there is no other test with at 
least as low probabilities of Type I and Type II errors and with smaller ex- 
pected sample sizes under either or both of the two hypotheses. Usually, how- 
ever, one is interested in the performance of the procedure for more values of 
the parameter than these two. A disadvantage of the sequential probability 
ratio test is that in general the expected sample size is relatively large for values 
of the parameter between the two specified ones; that is, in cases in which one 
does not care greatly which decision is taken, a large number of observations is 
expected. The question is how to reduce the expected sample size for values of 
the parameter when this tends to be large. 

In this paper we consider a special case of the problem, when the distribution 
is normal with known variance and the parameter of interest is the mean. The 
sequential probability ratio test in this case consists in taking observations se- 
quentially and after each observation is taken comparing the sum of the ob- 
servations (referred to a suitable origin) with two constants. In this study the 
two constants are replaced by two linear functions of the number of observa- 
tions taken, and the taking of observations is truncated (Section 2). Approxi- 
mations to the operating characteristic (or power function) and the average 
sample size number are given (Sections 4 and 5). Computations for two cases
of special interest show a considerable decrease in average sample size at param- 
eter values between the two specified ones (Section 3). 

The problem is studied by replacing the sum of observations by the Wiener 
stochastic process (of a continuous time parameter); this can be thought of 
intuitively as interpolating between observations in a manner consistent with 
the addition of independent random variables. For this procedure we calculate 
exactly the operating characteristic, the distribution of observation time, the 
expected observation time, and related probabilities. 


Received July 24, 1959. 

1 This research was sponsored in part by the Office of Naval Research under Contract 
Number Nonr-266(33), Project Number NR 042-034. Reproduction in whole or in part is 
permitted for any purpose of the United States Government. 




2. The problem and the procedures studied. Let $f(x, \theta)$ be a family of densities
or (discrete) probability functions of a scalar random variable $X$ with $\theta$ a scalar
parameter. Suppose that $\theta$ is unknown, and that we are going to take
observations on $X$ to determine whether $\theta$ is large or small. One way of formalizing this
problem is to say we are going to test the null hypothesis $H_0$ that $\theta = \theta_0$ against
the alternative hypothesis $H_1$ that $\theta = \theta_1 > \theta_0$, where $\theta_0$ and $\theta_1$ are two suitably
chosen numbers. The sequential probability ratio test is a procedure for this
testing problem. Let

$$z(x) = \log \frac{f(x, \theta_1)}{f(x, \theta_0)}, \tag{2.1}$$

and choose two numbers $a$ and $b$ ($< a$). The procedure consists of taking
observations $x_1, x_2, \ldots$ sequentially. At the $m$th step, if

$$b < \sum_{i=1}^{m} z(x_i) < a, \tag{2.2}$$

take another observation; if the sum is not greater than $b$, accept $H_0$
(equivalently reject $H_1$); and if the sum is not less than $a$, accept $H_1$ (equivalently
reject $H_0$).

It is convenient to summarize the characteristics of this test (or any other

sequential test) by two functions, namely, the operating characteristic

$$L(\theta) = \Pr\{\text{accepting } H_0 \mid \theta\} \tag{2.3}$$

(which is the complement of the power function) and the expected sample size,
$\mathcal{E}_\theta n$, which is the expected number of observations when sampling from $f(x, \theta)$.
In terms of these functions at $\theta_0$ and $\theta_1$ the sequential probability ratio test has
an optimum property. If $L(\theta)$ and $\mathcal{E}_\theta n$ are the operating characteristic and
expected sample size for a sequential probability ratio test and $L^*(\theta)$ and $\mathcal{E}^*_\theta n$
are the same functions for another test, then if

$$L^*(\theta_0) \ge L(\theta_0), \qquad L^*(\theta_1) \le L(\theta_1),$$

it follows that $\mathcal{E}^*_{\theta_0} n \ge \mathcal{E}_{\theta_0} n$ and $\mathcal{E}^*_{\theta_1} n \ge \mathcal{E}_{\theta_1} n$. That is, if the second test is as good
as the sequential probability ratio test with respect to the probabilities of
decisions when sampling from $f(x, \theta_0)$ and $f(x, \theta_1)$ it cannot be better (and usually
will be worse) with respect to the expected number of observations.

Usually, however, one is interested in the behavior of a procedure over a 
range of values of the parameter, not just a pair of values. In many situations 
one’s desire to take a certain action increases as the parameter increases; the 
customary way of setting up a sequential probability ratio test requires an 
evaluation of the desirability of taking actions relative to values of parameters 
by specifying two values, θ₀ and θ₁, and the desired probabilities of actions at
these two parameter values. This is a somewhat arbitrary way of formalizing 
the real life problem and is perhaps done mainly as a convenience for the theo- 
retical statistician. However, it may be that these requirements on the operating 





MODIFICATION OF SEQUENTIAL TEST 167 


characteristic for reasonable procedures control the operating characteristic 
enough so that it is satisfactory. On the other hand, controlling the expected 
sample size at these two values does not necessarily yield an expected sample 
size function that can be considered satisfactory over a range of parameter values. 
In particular, the sequential probability ratio test in many problems has an
expected sample size function that is much higher for values of θ between θ₀ and
θ₁ than for these two values. Between θ₀ and θ₁ presumably one is less interested
in which of the two actions is taken, but here is where one has to take a large
number of observations. For example, in one case considered below the expected
number of observations goes up to about 5/3 of the number at θ₀ and θ₁; in
another case it more than doubles. The question is whether there are other
sequential procedures which will reduce the expected number of observations for
parameter values in the middle of the range without increasing it much at θ₀
and θ₁.

Another difficulty with the sequential probability ratio test is that for most 
cases the number of observations is a random variable which is unbounded and 
has a positive probability of being greater than any given constant. Since it is 
awkward to provide for taking an arbitrarily large number of observations, fre- 
quently the sequential probability ratio test is truncated; that is, if a certain 
number of observations are taken, then the process is stopped and a decision 
is made. This modified procedure (with different numbers a and b) may
considerably increase the expected sample size at θ₀ and θ₁. We can also consider
better methods of modifying the sequential probability ratio test so as to limit
the number of observations that can be taken.

The sequential probability ratio test is defined in terms of Z_m = Σ_{i=1}^m z(x_i).
The procedure can be described graphically in the plane of m and Z. There are
two lines Z = a and Z = b. Sampling is stopped as soon as the sequence Z₁,
Z₂, ··· leaves the strip between the two lines, and the decision depends on
which line is crossed. One might consider modifications of these boundaries to
obtain other procedures (called generalized sequential probability ratio tests by
Weiss). If the densities have the so-called Koopman-Darmois form,


         \exp[\alpha(\theta) + \beta(\theta)\gamma(x) + \delta(x)],


the probability ratio for any two values of θ depends on the observation only
through γ(x), and inference can be based on Σ γ(x_i), which is equivalent to Z.
To control the expected sample size it seems reasonable to put an upper bound
on the number of observations that can be taken. One also expects that a good
procedure should lead to decision after a small number of observations if Z is
either very large or very small. Another intuitive impression is that the
boundaries should be smooth. (The truncated sequential probability ratio test seems
inefficient because if Z is large and the number of observations is near the
truncation value a few additional observations will not permit much chance of
rejecting H₁ whatever these observations are.)

There are many ways of formalizing the problem that has been discussed. 







We can set L(θ₀) and L(θ₁) and ask for the procedure that minimizes sup_θ ℰ_θ n
or that minimizes ℰ_{θ*} n for some specified θ* between θ₀ and θ₁. In some cases
the former specification reduces to the latter (for example, if there is sufficient
symmetry).

Kiefer and Weiss [8] have shown that the procedures which minimize ℰ_{θ*} n for
various specified pairs L(θ₀) and L(θ₁) are essentially the class of solutions to
the Bayes problems, in which the three parameter values are given a priori
probabilities, say ξ₀, ξ₁ and ξ*, and


(2.4)    \xi_0[1 - L(\theta_0)] + \xi_1 L(\theta_1) + \xi^* \mathcal{E}_{\theta^*} n


is to be minimized. Furthermore, if f(x, θ) is of the Koopman-Darmois form²
a Bayes procedure is defined in terms of the probability ratio (2.1) and has
continuation regions


(2.5)    b_m < \sum_{i=1}^{m} z(x_i) < a_m


for m = 1, ···, M. At most M observations are taken, and the two sequences
of numbers usually satisfy b_m < b_{m+1}, a_{m+1} < a_m. In principle, the fact that
at most a given number of observations is to be taken permits computation 
of the Bayes solution; the expected loss at the Mth stage is a function of a 
posteriori probabilities; the best action for any such probabilities is clear. In 
turn, the a posteriori probabilities at the Mth stage depend on the a posteriori 
probabilities at the (M — 1)st stage and the Mth observation; this leads to 
computation of the best action or whether to continue sampling at the (M — 1)st 
stage. The computations can be carried back to the first observation. Unfor- 
tunately, even for the normal and binomial distributions these computations 
are laborious and the procedures do not seem to be easy to describe. 

In this paper we consider a particular testing problem, namely, that of the 
mean of a normal distribution with known variance. The procedures studied 
consist of pairs of straight lines (not necessarily parallel) with possible trun- 
cation. 

Observations are drawn sequentially from a normal distribution with mean μ
and variance σ². We want to decide whether μ is large or small given knowledge
of σ², and we want to keep the sample size down, particularly at moderate values
of μ. We can put the problem formally that we want to test the null hypothesis
μ = μ₀ against the alternative μ = μ₁ (μ₁ > μ₀) and wish a procedure to
minimize ℰ_μ n at μ = ½(μ₀ + μ₁) or alternatively to minimize the supremum of ℰ_μ n.
It is convenient to replace the observation x by the transformed observation


2 Assumption A of [8] essentially implies this for the three parameter values if
f(x, θ₁)/f(x, θ₀) takes on all values from 0 to ∞.

It might be helpful to readers of [8] to point out that for given a priori probabilities the
a posteriori probabilities after m observations, say ξ₀(m), ξ₁(m), ξ*(m), lie on a curve
ξ*(m) = C k^m ξ₀^c(m) ξ₁^{1−c}(m), where C > 0, k > 1, 0 < c < 1; each such curve lies nearer the
point ξ* = 1 than the curves for observation numbers less than m.







[x − ½(μ₀ + μ₁)]/σ and call μ* = ½(μ₁ − μ₀)/σ. Then the problem is to test
the null hypothesis μ = −μ* (μ* > 0) against the alternative μ = μ* when
sampling from N(μ, 1).

The probability ratio in this problem is z(x) = 2μ*x. The sequential
probability ratio test has the continuation interval


(2.6)    b < 2\mu^* \sum_{i=1}^{m} x_i < a.


Equivalently, it can be represented as


(2.7)    \bar{b} < \sum_{i=1}^{m} x_i < \bar{a},


where b̄ = b/(2μ*) and ā = a/(2μ*). The test can be described graphically in
the plane of m and y = Σ_{i=1}^m x_i. There are two parallel lines y = b̄ and y = ā.
One plots successively the points (1, x₁), (2, x₁ + x₂), ···. As soon as a point
is obtained which is not between the lines, sampling is stopped; if the point is
on or above y = ā, H₀ is rejected, and if the point is on or below y = b̄, H₀
is accepted. In principle ā and b̄ are selected to attain the desired L(μ*) and
L(−μ*), but in practice these are approximated.
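The parallel-line rule just described can be sketched in a few lines of Python. This is a minimal illustration, not the author's computation: the function names `standardize` and `sprt_normal`, the thresholds a = −b = log 19, and the sample values are all assumptions chosen for the example.

```python
import math

def standardize(x, mu0, mu1, sigma):
    """Map a raw observation to (x - (mu0 + mu1)/2) / sigma."""
    return (x - 0.5 * (mu0 + mu1)) / sigma

def sprt_normal(xs, mu_star, a, b):
    """SPRT for N(mu, 1): H0 mu = -mu_star against H1 mu = +mu_star.

    Sampling continues while b/(2 mu*) < sum of x_i < a/(2 mu*), as in (2.7).
    Returns (decision, number of observations used).
    """
    if mu_star <= 0 or not b < a:
        raise ValueError("need mu_star > 0 and b < a")
    upper = a / (2.0 * mu_star)   # upper parallel line
    lower = b / (2.0 * mu_star)   # lower parallel line
    total = 0.0
    m = 0
    for m, x in enumerate(xs, start=1):
        total += x
        if total >= upper:
            return "reject H0", m
        if total <= lower:
            return "accept H0", m
    return "undecided", m         # data ran out before a boundary was reached

# Hypothetical thresholds and data, for illustration only.
a, b = math.log(19.0), -math.log(19.0)
xs = [standardize(x, mu0=9.0, mu1=11.0, sigma=2.0) for x in [12.0] * 50]
print(sprt_normal(xs, mu_star=0.5, a=a, b=b))
```

The function returns "undecided" when the supplied data are exhausted, which reflects the point made above that the sample size of the untruncated test is not bounded in advance.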

In this paper we shall consider replacing the parallel straight lines y = ā and
y = b̄ by arbitrary straight lines y = c₁ + d₁m and y = c₂ + d₂m with possibly
truncation at N (that is, a line m = N). Such a procedure is as follows: Take
observations x₁, x₂, ··· sequentially. At the mth stage (m < N), reject H₀ if


(2.8)    \sum_{i=1}^{m} x_i \ge c_1 + d_1 m,


accept H₀ if


(2.9)    \sum_{i=1}^{m} x_i \le c_2 + d_2 m,


and take another observation if

(2.10)   c_2 + d_2 m < \sum_{i=1}^{m} x_i < c_1 + d_1 m.


If N observations are taken, stop sampling and reject H₀ if Σ_{i=1}^N x_i ≥ k and
accept H₀ if Σ_{i=1}^N x_i < k. We take c₁ > 0 > c₂. To avoid redundancy in the
definition (that is, intersection of the lines before n = N), we require


         c_2 + d_2(N - 1) < c_1 + d_1(N - 1).


From the intuitive considerations mentioned above we might surmise that the
desirable procedures of this type are those for which d₁ < 0 < d₂; that is, those
for which the lines converge.
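A minimal Python sketch of the rule (2.8)-(2.10) with truncation at N may make the definition concrete. The function name `linear_boundary_test` and the numerical constants in the usage line are hypothetical; only the stopping logic is taken from the text.

```python
def linear_boundary_test(xs, c1, d1, c2, d2, N, k):
    """Sequential test with straight-line boundaries and truncation at N.

    For m < N: reject H0 if sum >= c1 + d1*m, accept H0 if sum <= c2 + d2*m.
    At m = N: reject H0 if sum >= k, otherwise accept H0.
    Returns (decision, number of observations used).
    """
    if not c1 > 0 > c2:
        raise ValueError("need c1 > 0 > c2")
    if not c2 + d2 * (N - 1) < c1 + d1 * (N - 1):
        raise ValueError("lines must not intersect before N")
    total = 0.0
    m = 0
    for m, x in enumerate(xs, start=1):
        total += x
        if m < N:
            if total >= c1 + d1 * m:
                return "reject H0", m
            if total <= c2 + d2 * m:
                return "accept H0", m
        else:
            # Truncation: a decision is forced at the Nth observation.
            return ("reject H0" if total >= k else "accept H0"), m
    return "undecided", m

# Converging lines (d1 < 0 < d2) with hypothetical constants.
print(linear_boundary_test([1.0] * 30, c1=10.0, d1=-0.5, c2=-10.0, d2=0.5, N=20, k=0.0))
```

With d₁ = d₂ = 0 and N larger than any sample the function reduces to the parallel-line test of (2.7).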

To calculate the probabilities and expected values that are of interest is ex- 
tremely complicated and involved. However, we can calculate such quantities 







if we replace the sequence Σ_{i=1}^m x_i (m = 1, 2, ···) by an analogous X(t)
(0 ≤ t < ∞). The random variable Σ_{i=1}^m x_i is normally distributed with mean
mμ and variance m; the increment from m to m′, Σ_{i=1}^{m′} x_i − Σ_{i=1}^m x_i = Σ_{i=m+1}^{m′} x_i,
is normally distributed independently of Σ_{i=1}^m x_i. These properties are retained
in the Wiener stochastic process. Let X(t) be a Gaussian (i.e., normally
distributed) stochastic process with ℰX(t) = μt, ℰ[X(t) − μt]² = t and such that
X(s) − X(t) is distributed independently of X(t) for s > t. Then Σ_{i=1}^m x_i has
the properties of X(m). Throughout the paper we shall assume this process
defined so that the probability is 1 that the path functions are continuous. Now
for this family of processes we consider the problem of testing the hypothesis
that μ = −μ* against the alternative μ = μ*. The procedure is to observe X(t)
continuously as long as


(2.11)   c_2 + d_2 t < X(t) < c_1 + d_1 t


and t ≤ T. If X(t) = c₁ + d₁t or X(t) = c₂ + d₂t, observation is stopped and
in the first case H₀ is rejected and in the second case H₀ is accepted. If X(t)
remains between the two lines, observation is stopped at t = T and H₀ is
rejected if X(T) ≥ k and accepted if X(T) < k. Since X(t) is continuous with
probability 1, the inequalities in (2.11) are violated directly by an equality.
Here we require


(2.12)   c_2 + d_2 T \le k \le c_1 + d_1 T.


In this paper the probability of rejecting H₀ is computed as a function of μ,
and the expected length of time of observation is found. It is proposed here
that one can select c₁, c₂, d₁, d₂, T and k so as to achieve a desirable procedure
for X(t); that is, obtain a specified significance level at −μ* and a specified
power at μ* with some minimization of the expected time. Then the operating
characteristic and expected time functions are approximations to the operating
characteristic and expected sample number functions when observations are
taken discretely.

It is difficult to ascertain how good these approximations are. One might hope
they are as good as for the sequential probability ratio test, which is the special
case of d₁ = d₂ = 0 and T = ∞. In principle this problem could be studied by
considering the case of observations taken discretely as observing X(t) at
t = 1, 2, ···. It is clear that applying the procedure when X(t) is only
observed at discrete time points leads to decision later (or at least no earlier) than
when observing it continuously. Hence, the expected time function
underestimates the expected sample number function. In the case of the sequential
probability ratio test the usual approximation underestimates the power
function when it is over ½ and overestimates it when it is under ½ (that is, indicates
poorer significance level and power at μ* than is actually the case). Similarly
here, at least when the lines converge to a point (c₁ + d₁T = c₂ + d₂T), the
same is true for these procedures.
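The remark that discrete observation cannot stop earlier than continuous observation can be illustrated by simulation. The sketch below is an assumption-laden stand-in: a fine time grid replaces truly continuous observation, and the names `simulate_path` and `first_exit`, the seed, and the boundary constants are all hypothetical.

```python
import math
import random

def simulate_path(mu, T, steps_per_unit, rng):
    """Approximate X(t) on a grid of width 1/steps_per_unit; path[j] ~ X(j*dt)."""
    dt = 1.0 / steps_per_unit
    path = [0.0]
    for _ in range(int(round(T * steps_per_unit))):
        path.append(path[-1] + mu * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return path

def first_exit(path, steps_per_unit, stride, c1, d1, c2, d2):
    """First checked time at which the path is outside (2.11).

    Only grid points whose index is a multiple of `stride` are checked;
    if none is outside, the final time is returned (truncation).
    """
    dt = 1.0 / steps_per_unit
    for j in range(stride, len(path), stride):
        t = j * dt
        if path[j] >= c1 + d1 * t or path[j] <= c2 + d2 * t:
            return t
    return (len(path) - 1) * dt

rng = random.Random(2024)
steps = 50
path = simulate_path(mu=0.0, T=30, steps_per_unit=steps, rng=rng)
fine = first_exit(path, steps, 1, 3.0, -0.1, -3.0, 0.1)        # near-continuous
unit = first_exit(path, steps, steps, 3.0, -0.1, -3.0, 0.1)    # t = 1, 2, ...
print(fine, unit)
```

Because the integer times are a subset of the grid points, the second stopping time is never smaller than the first on the same path, which is the pathwise form of the inequality stated above.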

In the case of observations taken discretely, the properties of a procedure







depend only on μ/σ, and the scale can be changed so that σ = 1. In the case
of the Wiener process we can change the scale on x and the scale on t. Let t = βs
and define X*(s) = αX(βs) (α > 0, β > 0). This has mean αμβs and variance
α²βs. If α²β = 1 [t = s/α² and X*(s) = αX(s/α²)], the variance of X*(s) is
s and the mean is (μ/α)s. The region (2.11) goes into


(2.13)   \alpha c_2 + (d_2/\alpha)s < X^*(s) = \alpha X(s/\alpha^2) < \alpha c_1 + (d_1/\alpha)s,


the bound t = T goes into s = S = α²T, and the value k goes into αk. For
any value of μ the probabilities for the procedure in terms of X*(s) are the
same as for the related procedure in terms of X(t), and the expected
observation time in terms of X*(s) is α² times the expected observation time in terms
of X(t). Thus a problem stated in terms of arbitrary μ₀, μ₁, and σ² is reduced
to a problem in terms of μ₀ = −μ*, μ₁ = μ* and σ = 1, where μ* is arbitrary.
(The accuracy of these results for the continuous time parameter as an
approximation for the case of observations taken discretely would depend on what
the original parameters were.)
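The change of scale can be written as a small helper that maps the constants of one procedure into those of the rescaled one. This is only a sketch of the substitution in (2.13); the function name `rescale_procedure` and the dictionary layout are hypothetical.

```python
def rescale_procedure(c1, d1, c2, d2, T, k, mu, alpha):
    """Constants for X*(s) = alpha * X(s / alpha**2), following (2.13).

    Intercepts and k are multiplied by alpha, slopes and the drift mu are
    divided by alpha, and the time bound is multiplied by alpha**2.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return {
        "c1": alpha * c1, "d1": d1 / alpha,
        "c2": alpha * c2, "d2": d2 / alpha,
        "T": alpha ** 2 * T, "k": alpha * k,
        "mu": mu / alpha,
    }

# Hypothetical constants: rescale a procedure for mu = 0.1 to drift 1.
print(rescale_procedure(20.0, -0.04, -20.0, 0.04, 500.0, 0.0, mu=0.1, alpha=0.1))
```

A quick consistency check is that a boundary point (t, c₁ + d₁t) maps to (α²t, α(c₁ + d₁t)), which lies on the rescaled line.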

The problem of modifying sequential analysis to reduce the sample size has
been considered by several statisticians. In the literature the case of a normal
mean has been investigated. Armitage [2] proposed straight line boundaries for
a two-sided test of μ = μ₀, converted to the Wiener process, but then only
approximated the probabilities and expected time. Donnelly [3] has proposed
straight line boundaries that meet and converted to the Wiener process; he
obtained some results similar to those of this paper by a different method (that is,
solutions of a partial differential equation satisfying certain boundary
conditions).


3. Numerical investigation of two cases. As will be seen later, the operating 
characteristic and expected observation time are complicated functions of the 
parameter μ and the constants defining the procedure. It, therefore, seems
hopeless to find analytically the optimum procedure within the class. Hence, two
cases have been investigated computationally. The results given in Tables 1 
and 2 show the advantage of the best procedures in these two cases. 


TABLE 1


Characteristics of Procedures with Probabilities of Types I and II Errors of 5% at
μ = −.1 and .1


                                 Expected Time     Expected Time
Condition          T                 μ = 0         μ = −.1 and .1


Fixed size                           270.6             270.6
SPRT                                 216.7             132.5
c + dT = 0                           192.2             139.2
c + dT = .1c       529.              192.2             139.3
c + dT = .2c                         192.2             139.8
c + dT = .3c                         192.4             139.4






TABLE 2


Characteristics of Procedures with Probabilities of Types I and II Errors of 1% at
μ = −.1 and .1


Condition          c            T


Fixed size
SPRT               22.976
c + dT = 0         35.52
c + dT = .1c       35.52
c + dT = .2c       35.52

The specified null hypothesis, μ = −μ*, and alternative hypothesis, μ = μ*,
are symmetrically located about μ = 0. We consider cases when the probabilities
of Type I and Type II errors are equal; that is, 1 − L(−μ*) = L(μ*). For
this probability fixed we want to find the procedure in our class that minimizes
the expected observation time at μ = 0. Since this problem is symmetric [that is,
X(t) can be replaced by −X(t)], it seems reasonable to consider only
symmetric procedures; that is, c₁ = −c₂ = c say, d₁ = −d₂ = d say, and k = 0.
(In fact, if an asymmetric procedure solved the problem, its mirror image would
and so would a symmetric randomization between these two procedures, but one
can argue that in this problem nonrandomized procedures form a complete
class.) For symmetric procedures the maximum expected observation time³ is
at μ = 0.

A symmetric procedure is defined by the constants c, d, and T. There is one
condition imposed by specifying the (equal) probabilities of error, leaving two
degrees of freedom in the constants. Subject to this condition the constants were
varied to obtain the smallest expected observation time at μ = 0. The most
relevant results of these computations are given in Tables 1 and 2 for
probabilities of equal Type I and II errors of .05 and .01, respectively.⁴

The line x = c + dt has intercept c at t = 0 and c + dT at t = T; when
c + dT = 0 the two straight-line boundaries converge to a point. For each of
several values of the ratio of these two intercepts [(c + dT)/c = 0, .1, .2] the
tables give the combination of c and T (and hence d) that approximately
minimizes the expected observation time at μ = 0.

The most interesting features of the numerical results are the comparisons of 
expected observation times between the sequential probability ratio test (SPRT) 
and the procedures for c + dT = 0 (lines converging to a point). The convergent 
line procedures show a considerable improvement over the sequential proba- 


3 This is a consequence of the fact that for a symmetric procedure the probability that
the observation time exceeds a given value of t is maximum for μ = 0 (and is increasing
for μ < 0 and decreasing for μ > 0). The latter follows from Corollary 5 of [1] applied to
X(t)/(c + dt). This demonstration was suggested by Hoeffding [7].

4 I wish to acknowledge the computational assistance of Mrs. Judy Frankman and Mr.
George Bump.







bility ratio tests at μ = 0 with a moderate decrease in efficiency at μ = −.1
and μ = .1. At the 5% levels the expected time for convergent lines at μ = 0
is 24.5 less than for the usual procedure and is 6.7 more at μ = ±.1 (a ratio
of 3.7 to 1); at the 1% levels it is 125.7 less at μ = 0 and 24.2 more at μ = ±.1
(a ratio of 5.2 to 1). Roughly speaking, we can say that when one operates at
the 5% levels he is better off with the convergent lines procedure if intermediate 
values of » occur at least 4 of the time and when one operates at the 1 % level 
if intermediate values occur at least } of the time. 
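The arithmetic behind such a rough statement can be sketched as follows, under the simplifying assumption (made here for illustration only) that μ equals 0 a fraction p of the time and ±.1 otherwise. The helper name `break_even_fraction` is hypothetical; the savings and penalties are the figures quoted in the preceding paragraph.

```python
def break_even_fraction(saving_at_zero, extra_at_ends):
    """Smallest p with p * saving_at_zero >= (1 - p) * extra_at_ends."""
    return extra_at_ends / (saving_at_zero + extra_at_ends)

for label, saving, extra in [("5% levels", 24.5, 6.7), ("1% levels", 125.7, 24.2)]:
    ratio = saving / extra
    p = break_even_fraction(saving, extra)
    print(f"{label}: ratio {ratio:.1f} to 1, break-even fraction {p:.3f}")
```

Under this two-point assumption the convergent-line procedure has the smaller average observation time whenever the intermediate value occurs more often than the printed break-even fraction.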

Hoeffding [7] has given a general lower bound for the expected sample size 
of a sequential procedure at one parameter value when the probabilities of error
are specified at two other parameter values (assuming the variance of the sample
size is finite). His lower bound for the expected time at μ = 0 is 187.0 for the
case in Table 1 and 388.3 for the case in Table 2. Thus the best procedure re- 
ported in Table 1 accomplishes at least 82% of the possible improvement over 
the sequential probability ratio test at the 5% level and at least 90% at the 
1% level. The lower bound given by Hoeffding cannot be achieved. While it is 
unknown how much this lower bound underestimates the minimum expected 
sample size, the comparison between the bound and the results in the tables 
shows that the given bound does not underestimate the minimum by much 
and that the tests presented in this paper come close to yielding the minimum 
possible expected sample size. 
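The percentages quoted above can be reproduced from the reported figures. In this sketch the function name `fraction_of_possible_improvement` is hypothetical, and the SPRT time for the 1% case is an inferred value, obtained by adding the stated saving of 125.7 to the stated convergent-line time of 402.17.

```python
def fraction_of_possible_improvement(sprt_time, procedure_time, lower_bound):
    """Share of the gap between the SPRT and the lower bound that is closed."""
    return (sprt_time - procedure_time) / (sprt_time - lower_bound)

# 5% case: values at mu = 0 from Table 1 and the bound 187.0.
f5 = fraction_of_possible_improvement(216.7, 192.2, 187.0)

# 1% case: SPRT time inferred as 402.17 + 125.7 (an assumption, see lead-in).
f1 = fraction_of_possible_improvement(402.17 + 125.7, 402.17, 388.3)

print(f"{f5:.3f} {f1:.3f}")
```

Since the bound itself cannot be attained, these fractions understate how much of the attainable improvement the procedures achieve, which is why the text says "at least".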

A combination of c and T in the tables yields the required probabilities of
Types I and II errors to 6 or 7 decimal places. The expected times are reported
to one decimal place; there may be an error of .1 (or occasionally even .2) in
these numbers. In particular in Table 2 the difference in the expected times at
μ = 0 between c + dT = 0 and c + dT = .1c is not significant (402.17
compared to 402.147). For the values of c given in the table one cannot distinguish
between the cases c + dT = 0 and c + dT = .1c because in the latter case
the probability of reaching a decision at t = T is almost 0.

As will be seen later, the probabilities and expected times can be given as 
infinite series of terms involving Mills' ratios. It was convenient to use tables
[5], [9] (which were extended for these computations) where these are given to 
5 decimal places, and this determined the eventual accuracy of the calculations. 
A good guess is that more accurate computation would not yield minimum 
expected times that differ from the figures in the tables by more than .1 or .2. 

For a given ratio of (c + dT)/c the value of c that minimizes the expected
time cannot be determined very accurately. For example, at the 1% levels at
c + dT = .1c the expected times at μ = 0 were 402.21, 402.15 and 402.25 for
c = 34.695, 35.52, and 36.345, respectively. Of course, since the functions are
flat it is of no importance to obtain an accurate determination of where the
minimum is. At the 1% levels the computations were done by setting c and
adjusting T (and hence d); at the 5% levels they were done by setting √T and
adjusting⁵ c. The variations in c given in Table 1 are of no consequence; a more


5 The former procedure is much preferable for comparing the different ratios of (c + dT)/c. 







accurate determination of the c's minimizing the expected times at μ = 0 should
find them much closer⁶ than in Table 1 and not all equal as indicated in Table 2.

The computations have been done on the basis of the Wiener process and are 
considered as approximations to the problem of sampling discretely (as described 
in Section 2). It can be expected that the errors of approximation are greater 
than the errors of computation. Thus to the extent that one accepts the ap- 
proximation one can consider the procedures given in the tables for c + dT = 0 
and c + dT = .lc as procedures nearly minimizing the expected sample size 
at μ = 0.


4. Probabilities of error for the Wiener Process. 

4.1. Outline of derivations. The process X(t) with mean ℰX(t) = μt and
variance t depends on a single parameter μ. The probability of accepting H₀ is the
probability of the process X(t) touching the lower boundary x = c₂ + d₂t before
touching the upper boundary x = c₁ + d₁t and before t = T plus the probability
of the process staying between the boundaries to t = T and X(T) ≤ k. This
probability is the operating characteristic, which we shall denote by L(μ). Its
complement 1 − L(μ) is the power function.

Let P₁(T) be the probability that the process touch the upper boundary
before touching the lower boundary and before t = T and P₂(T) be the probability
that the process touch the lower boundary before touching the upper
boundary and before t = T. Then P₀(T) = 1 − P₁(T) − P₂(T) is the probability
that the process stay between the boundaries to t = T. In this section we
shall find expressions for these various probabilities.

We can let


(4.1)    X(t) = Y(t) + \mu t,


where Y(t) is a Wiener process with ℰY(t) = 0 and ℰY²(t) = t. Then X(t) =
c_i + d_i t is equivalent to Y(t) = c_i + (d_i − μ)t and X(T) ≤ k is equivalent
to Y(T) ≤ k − μT. It will be convenient to obtain some of the results for
Y(t) and then convert them back to X(t).

The (unconditional) distribution of Y(T) is normal with mean 0 and
variance T. Given Y(T) = y, the process Y(t) is Gaussian (normal) with a certain
expected value function and covariance function. We obtain P₁(y, T) by finding
the conditional probability of touching the upper boundary before the lower
boundary and before t = T given Y(T) = y and then taking the expected value
of this conditional probability relative to the marginal distribution of Y(T).
The process Y(t) conditional on Y(T) can be transformed into the Wiener
process by a transformation which carries the original straight-line boundaries
into other straight-line boundaries. The problem then becomes finding the


6 The lack of monotonicity in the last column of Table 1 with respect to (c + dT)/c
appears to be due to c = 20.340 being larger than the other approximately minimizing values
of c, which in turn seems due to variations in computing procedures. Since it is hoped to do
a more extensive numerical investigation on a high-speed computing machine, it has not
seemed worthwhile to carry out the present calculations more accurately.







probability that the Wiener process touches an upper straightline boundary 
before a lower one; this problem is handled first. 

4.2. Probability of going over one line first. The first problem is to find the 
probability of touching one boundary before the other when the process can go 
on without limit (T = ∞). In this case the lines do not converge for then the
process could not go on beyond the point of intersection of the two lines. 

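THEOREM 4.1. If Y(t) is the Wiener process with ℰY(t) = 0 and ℰY²(t) = t,
then for γ₁ > 0, γ₂ < 0 and δ₁ ≥ δ₂ (not δ₁ = δ₂ = 0) the probability that
Y(t) ≥ γ₁ + δ₁t for a smaller t than any t for which Y(t) ≤ γ₂ + δ₂t is


(4.2)
    \sum_{r=1}^{\infty} \Big\{ e^{-2[r^2\gamma_1\delta_1 + (r-1)^2\gamma_2\delta_2 - r(r-1)(\gamma_1\delta_2 + \gamma_2\delta_1)]}
        - e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_1\delta_2 - r(r+1)\gamma_2\delta_1]} \Big\},
                                                            \delta_1 \ge 0,

    = 1 - \sum_{r=1}^{\infty} \Big\{ e^{-2[(r-1)^2\gamma_1\delta_1 + r^2\gamma_2\delta_2 - r(r-1)(\gamma_1\delta_2 + \gamma_2\delta_1)]}
        - e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r+1)\gamma_1\delta_2 - r(r-1)\gamma_2\delta_1]} \Big\},
                                                            \delta_1 \le 0,

    = \frac{e^{-2\delta_1\gamma_2} - 1}{e^{2\delta_1(\gamma_1 - \gamma_2)} - 1},
                                                            \delta_1 = \delta_2 \ne 0.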
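As a numerical illustration, the series for the probability of touching the upper line first can be summed directly. The sketch below is hypothetical code, not the computation used for the tables: the name `prob_upper_first` and the fixed number of terms are assumptions, and the case δ₁ < 0 is handled by the reflection described in the proof (one minus the probability that the lower line is touched first).

```python
import math

def prob_upper_first(g1, g2, d1, d2, terms=200):
    """P{Y touches g1 + d1*t before g2 + d2*t} for a standard Wiener process.

    Requires g1 > 0 > g2 and d1 >= d2, not both slopes zero.
    """
    if not g1 > 0 > g2 or d1 < d2 or (d1 == 0 and d2 == 0):
        raise ValueError("need g1 > 0 > g2, d1 >= d2, slopes not both 0")
    if d1 < 0:
        # Reflect the picture: interchange (g1, d1) and (-g2, -d2).
        return 1.0 - prob_upper_first(-g2, -g1, -d2, -d1, terms)
    total = 0.0
    for r in range(1, terms + 1):
        a = r * r * g1 * d1 + (r - 1) ** 2 * g2 * d2 - r * (r - 1) * (g1 * d2 + g2 * d1)
        b = r * r * (g1 * d1 + g2 * d2) - r * (r - 1) * g1 * d2 - r * (r + 1) * g2 * d1
        total += math.exp(-2.0 * a) - math.exp(-2.0 * b)
    return total

# Parallel rising lines: the series is geometric and has a closed form.
print(prob_upper_first(1.0, -2.0, 0.5, 0.5))
```

For parallel lines the printed value can be compared with the ratio of exponentials given for δ₁ = δ₂ ≠ 0; for very small slopes more terms would be needed, since the series then converges slowly.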


PROOF. Let A_i (i = 1, 2, ···) be the event of a path y(t) touching (or going
over) the upper line and then touching (or going below) the lower line and
alternately touching the upper and lower lines until each has been touched
i − 1 times followed by touching the upper line⁷; let B_i be the event of touching
the lower line and then touching the upper line and alternately touching the
two lines until each has been touched i times. Then the event whose probability
we are finding, namely, touching the upper line before the lower line, is


(4.3)    A_1 - B_1 + A_2 - B_2 + \cdots.

For δ₁ > 0, δ₂ < 0 Doob [4] has shown that the probability of A_r is


(4.4)    \alpha_r = e^{-\frac{1}{2}\{(2r)^2\gamma_1\delta_1 + (2r-2)^2\gamma_2\delta_2 - [(2r-1)^2 - 1](\gamma_1\delta_2 + \gamma_2\delta_1)\}}
                  = e^{-2[r^2\gamma_1\delta_1 + (r-1)^2\gamma_2\delta_2 - r(r-1)(\gamma_1\delta_2 + \gamma_2\delta_1)]}


and the probability of B_r is


(4.5)    \beta_r = e^{-\frac{1}{2}[(2r)^2\gamma_1\delta_1 + (2r)^2\gamma_2\delta_2 - (2r)(2r-2)\gamma_1\delta_2 - 2r(2r+2)\gamma_2\delta_1]}
                 = e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_1\delta_2 - r(r+1)\gamma_2\delta_1]}.


We shall derive these results and show that they also hold when 0 ≤ δ₂ < δ₁.
Then the theorem follows directly for 0 < δ₁.


7 In "touching" one line before "touching" the other line the path may contact and
cross the first line several times before contacting the other line. For example, a path is in
A₂ if it has contacted the upper line at some t₁, the lower line at some t₂ (t₂ > t₁), and the
upper line at some t₃ (t₃ > t₂) regardless of other contacts with the lines.







The probabilities (4.4) and (4.5) result from a more general result given in 
Lemma 4.1. We consider a sequence of lines such that the odd-numbered lines 
are above the origin and each even-numbered line is below the origin and entirely 
below the odd-numbered line preceding it and the one following it in the sequence. 
We consider the event of a path touching the first line, then the second, 
until 2r — 1 are touched and also the event of touching the second, --- , until 
2r — 2 are touched. 


LEMMA 4.1. If L_i is the line

(4.6)    y = (-1)^{i+1} u_i t + (-1)^{i+1} v_i,


v_i > 0, u_{2i−1} ≥ 0, −u_{2i} < u_{2i−1}, −u_{2i} < u_{2i+1}, the probability of the process Y(t)
touching L₁, L₂, ···, L_{2r−1} in sequence is


(4.7)    \alpha_r(u_1, v_1; \cdots; u_{2r-1}, v_{2r-1})
             = \exp\Big\{ -2\Big( \sum_{i=1}^{2r-1} u_i v_i + 2 \sum_{i=2}^{2r-1} \sum_{j=1}^{i-1} u_i v_j \Big) \Big\}


and the probability of touching L₂, ···, L_{2r−1} in sequence is


(4.8)    \beta_{r-1}(u_2, v_2; \cdots; u_{2r-1}, v_{2r-1})
             = \exp\Big\{ -2\Big( \sum_{i=2}^{2r-1} u_i v_i + 2 \sum_{i=3}^{2r-1} \sum_{j=2}^{i-1} u_i v_j \Big) \Big\}.


PROOF. For v₁ > 0, u₁ ≥ 0, the probability of reaching the one line L₁ is


(4.9)    \alpha_1(u_1, v_1) = e^{-2u_1 v_1}.


We shall reduce the other cases to this formula. Consider a path that touches
L₁, ···, L_{2r−2} in sequence and let t_{2r−2} be the first value of t for which the path
touches L_{2r−2} after touching L₁, ···, L_{2r−3}. The conditional probability of then
touching L_{2r−1} is


(4.10)   e^{-2u_{2r-1}[(u_{2r-1} + u_{2r-2})t_{2r-2} + v_{2r-1} + v_{2r-2}]}


for the line L_{2r−1} has slope u_{2r−1} and has intercept (u_{2r−1}t_{2r−2} + v_{2r−1}) +
(u_{2r−2}t_{2r−2} + v_{2r−2}) when referred to (t_{2r−2}, −u_{2r−2}t_{2r−2} − v_{2r−2}) as origin. This
conditional probability is the same as the conditional probability of touching
the line with slope u_{2r−1} + u_{2r−2} and which is at t = t_{2r−2} a distance of


(4.11)   h = \frac{u_{2r-1}[(u_{2r-1} + u_{2r-2})t_{2r-2} + v_{2r-1} + v_{2r-2}]}{u_{2r-1} + u_{2r-2}}


above L_{2r−2}. This is also the same as the conditional probability of touching the
line with slope −(u_{2r−1} + u_{2r−2}) and which is at t = t_{2r−2} a distance of h below
L_{2r−2} (since Y(t) is symmetrically distributed and has independent increments).
This last line is


(4.12)   y = -(u_{2r-1} + u_{2r-2})t - \frac{u_{2r-1}v_{2r-1} + u_{2r-2}v_{2r-2} + 2u_{2r-1}v_{2r-2}}{u_{2r-1} + u_{2r-2}},


which does not depend on t_{2r−2}. Thus the probability of touching in sequence
L₁, ···, L_{2r−1} is the same as the probability of touching in sequence L₁, ···,
L_{2r−2}, and the last line (4.12). This last line, however, lies entirely below L_{2r−2}
and can be touched only if the path has touched L_{2r−2} (which lies entirely below
L_{2r−3}). Hence, this is the probability of touching in sequence L₁, ···, L_{2r−3},
and (4.12).

Now let us reduce this one step further. Let the line (4.12) be


         L^*: y = -u^* t - v^*.


Note that u* > 0 and v* > 0. Let t_{2r−3} be the first value of t for which the path
touches L_{2r−3} after touching the preceding lines in sequence. Then the
conditional probability of touching L* is

(4.13)   e^{-2u^*[(u^* + u_{2r-3})t_{2r-3} + v^* + v_{2r-3}]}.

By the same reasoning as before the probability sought for is

(4.14)   \alpha_{r-1}\Big( u_1, v_1; \cdots; u_{2r-4}, v_{2r-4};\; u_{2r-3} + u^*,\;
             \frac{u_{2r-3}v_{2r-3} + u^*v^* + 2u^*v_{2r-3}}{u_{2r-3} + u^*} \Big).


From this it follows


(4.15)   \alpha_r(u_1, v_1; \cdots; u_{2r-1}, v_{2r-1})
           = \alpha_{r-1}\Big( u_1, v_1; \cdots; u_{2r-4}, v_{2r-4};\; u_{2r-3} + u_{2r-2} + u_{2r-1},
             \frac{u_{2r-3}v_{2r-3} + u_{2r-2}v_{2r-2} + u_{2r-1}v_{2r-1}
                   + 2(u_{2r-2}v_{2r-3} + u_{2r-1}v_{2r-2} + u_{2r-1}v_{2r-3})}
                  {u_{2r-3} + u_{2r-2} + u_{2r-1}} \Big).


By carrying this procedure on, we arrive at


(4.16)   \alpha_r(u_1, v_1; \cdots; u_{2r-1}, v_{2r-1})
           = \alpha_1\Big( \sum_{i=1}^{2r-1} u_i,\;
             \frac{\sum_{i=1}^{2r-1} u_i v_i + 2\sum_{i=2}^{2r-1}\sum_{j=1}^{i-1} u_i v_j}{\sum_{i=1}^{2r-1} u_i} \Big),

which is (4.7).

The other part of the lemma follows similarly. It will be noted that the
conditions for the lemma can be reduced further. Pairs of successive lines should
not intersect; the slopes should satisfy u_{2r−1} ≥ 0, u_{2r−1} + u_{2r−2} > 0, u_{2r−1} +
u_{2r−2} + u_{2r−3} > 0, ···, Σ_{i=1}^{2r−1} u_i > 0.

The probabilities used in Theorem 4.1 are


(4.17)   \alpha_r = \alpha_r(\delta_1, \gamma_1; -\delta_2, -\gamma_2; \delta_1, \gamma_1; \cdots; \delta_1, \gamma_1),
(4.18)   \beta_r = \beta_r(-\delta_2, -\gamma_2; \delta_1, \gamma_1; \cdots; \delta_1, \gamma_1),


for γ₁ > 0, γ₂ < 0, δ₁ ≥ 0, δ₁ > δ₂.
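The substitution (4.17)-(4.18) can be checked numerically against the closed forms (4.4) and (4.5). The following sketch evaluates the double sum of the lemma for an arbitrary sequence of lines; the function names `touch_sequence_prob`, `alpha_r` and `beta_r` and the test constants are hypothetical.

```python
import math

def touch_sequence_prob(lines):
    """exp{-2(sum_i u_i v_i + 2 sum_{i>j} u_i v_j)} for lines [(u, v), ...]."""
    total = 0.0
    earlier_v = 0.0            # running sum of v_j over preceding lines
    for u, v in lines:
        total += u * v + 2.0 * u * earlier_v
        earlier_v += v
    return math.exp(-2.0 * total)

def alpha_r(r, g1, g2, d1, d2):
    """(4.17): upper, lower, ..., upper (2r - 1 lines)."""
    lines = [(d1, g1) if i % 2 == 1 else (-d2, -g2) for i in range(1, 2 * r)]
    return touch_sequence_prob(lines)

def beta_r(r, g1, g2, d1, d2):
    """(4.18): lower, upper, ..., upper (2r lines)."""
    lines = [(-d2, -g2) if i % 2 == 1 else (d1, g1) for i in range(1, 2 * r + 1)]
    return touch_sequence_prob(lines)

g1, g2, d1, d2 = 1.0, -1.5, 0.4, -0.2   # hypothetical constants
r = 2
closed_alpha = math.exp(-2 * (r * r * g1 * d1 + (r - 1) ** 2 * g2 * d2
                              - r * (r - 1) * (g1 * d2 + g2 * d1)))
print(alpha_r(r, g1, g2, d1, d2), closed_alpha)
```

The running sum makes the double sum linear in the number of lines, which mirrors the way the proof merges lines one at a time.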

Now consider the case δ₁ < 0. Then the probability is 1 that the upper line
is touched at least once. Hence, the probability that the upper line is touched
first is simply 1 minus the probability that the lower line is touched before the
upper. This latter probability corresponds to the first part of Theorem 4.1 with
(γ₁, δ₁) and (−γ₂, −δ₂) interchanged.

Finally, the case δ₁ = δ₂ cannot be obtained directly from Lemma 4.1 but can
be derived in a similar fashion. Suppose δ₁ = δ₂ = δ > 0. Let L_i be the line


         y = \delta t + (-1)^{i+1} v_i, \qquad v_i > 0,\ \delta > 0.


Given that a path touches L_{2r−2}, the conditional probability of then touching
L_{2r−1} is exp{−2δ(v_{2r−2} + v_{2r−1})} since the last line has slope δ and is v_{2r−2} + v_{2r−1}
above the next-to-last line. Given that a path touches L_{2r−3} it must touch L_{2r−2}.
Thus given that a path touches L_{2r−3}, the conditional probability of touching
L_{2r−2} and then L_{2r−1} is exp{−2δ(v_{2r−2} + v_{2r−1})} which is the conditional
probability of touching the line y = δt + (v_{2r−3} + v_{2r−2} + v_{2r−1}). Thus


(4.19)   \alpha_r(\delta, v_1; -\delta, v_2; \delta, v_3; \cdots; \delta, v_{2r-1})
           = \alpha_{r-1}(\delta, v_1; -\delta, v_2; \cdots; \delta, v_{2r-3} + v_{2r-2} + v_{2r-1}).

From this it follows that


(4.20)   \alpha_r(\delta, v_1; -\delta, v_2; \cdots; \delta, v_{2r-1}) = \alpha_1(\delta, v_1 + \cdots + v_{2r-1}) = e^{-2\delta(v_1 + \cdots + v_{2r-1})}.


If v_{2i−1} = γ₁, v_{2i} = −γ₂, and δ = δ₁, we have


(4.21)   \alpha_r = e^{-2\delta_1[r\gamma_1 - (r-1)\gamma_2]}.


Similarly


(4.22)   \beta_{r-1}(-\delta, v_2; \delta, v_3; \cdots; \delta, v_{2r-1}) = e^{-2\delta \sum_{i=2}^{2r-1} v_i}

and


(4.23)   \beta_{r-1} = e^{-2\delta_1 (r-1)(\gamma_1 - \gamma_2)}.


From these results, the third part of Theorem 4.1 follows.

4.3. Conditional probability of going over one line first. We now find the proba- 
bility of touching one line first conditional on the path going through a certain 
point. 

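THEOREM 4.2. If $Y(t)$ is the Wiener process with $\mathcal{E}Y(t) = 0$, $\mathcal{E}Y^2(t) = t$, and
if $T$, $\gamma_1$, $\gamma_2$, $\delta_1$, $\delta_2$ are numbers such that $\gamma_1 > 0$, $\gamma_2 < 0$,
$\gamma_1 + \delta_1 T \ge \gamma_2 + \delta_2 T$, $T > 0$, the conditional probability that
$Y(t) \ge \gamma_1 + \delta_1 t$ for a smaller $t$ ($t \le T$) than any $t$ for which
$Y(t) \le \gamma_2 + \delta_2 t$ given $Y(T) = y$ (not $\gamma_1 + \delta_1 T = \gamma_2 + \delta_2 T = y$) is

\[
\begin{aligned}
P_1(T, y) ={}& \sum_{r=1}^{\infty} \Big\{ \exp\Big[-\tfrac{2}{T}\big\{ r^2\gamma_1(\gamma_1+\delta_1T-y) + (r-1)^2\gamma_2(\gamma_2+\delta_2T-y) \\
 &\qquad\qquad - r(r-1)[\gamma_1(\gamma_2+\delta_2T-y) + \gamma_2(\gamma_1+\delta_1T-y)] \big\}\Big] \\
 &\quad - \exp\Big[-\tfrac{2}{T}\big\{ r^2[\gamma_1(\gamma_1+\delta_1T-y) + \gamma_2(\gamma_2+\delta_2T-y)] \\
 &\qquad\qquad - r(r-1)\gamma_1(\gamma_2+\delta_2T-y) - r(r+1)\gamma_2(\gamma_1+\delta_1T-y) \big\}\Big] \Big\},
 \qquad \gamma_1 + \delta_1 T \ge y, \\[4pt]
 ={}& 1 - \sum_{r=1}^{\infty} \Big\{ \exp\Big[-\tfrac{2}{T}\big\{ r^2\gamma_2(\gamma_2+\delta_2T-y) + (r-1)^2\gamma_1(\gamma_1+\delta_1T-y) \\
 &\qquad\qquad - r(r-1)[\gamma_1(\gamma_2+\delta_2T-y) + \gamma_2(\gamma_1+\delta_1T-y)] \big\}\Big] \\
 &\quad - \exp\Big[-\tfrac{2}{T}\big\{ r^2[\gamma_1(\gamma_1+\delta_1T-y) + \gamma_2(\gamma_2+\delta_2T-y)] \\
 &\qquad\qquad - r(r-1)\gamma_2(\gamma_1+\delta_1T-y) - r(r+1)\gamma_1(\gamma_2+\delta_2T-y) \big\}\Big] \Big\},
 \qquad \gamma_1 + \delta_1 T \le y, \\[4pt]
 ={}& \frac{e^{-(2/T)\gamma_2(\gamma_1+\delta_1T-y)} - 1}{e^{(2/T)(\gamma_1-\gamma_2)(\gamma_1+\delta_1T-y)} - 1},
 \qquad \gamma_1 + \delta_1 T = \gamma_2 + \delta_2 T \ne y.
\end{aligned}
\tag{4.24}
\]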


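PROOF. Let $Y(t \mid T, y)$ be the process $Y(t)$ given $Y(T) = y$. Then

\[ \mathcal{E}Y(t \mid T, y) = \frac{t}{T}\, y, \tag{4.25} \]

\[ \operatorname{Var} Y(t \mid T, y) = t - \frac{t^2}{T}, \tag{4.26} \]

\[ \operatorname{Cov}\{Y(s \mid T, y),\, Y(t \mid T, y)\} = s - \frac{st}{T}, \qquad s \le t. \tag{4.27} \]

Define a new process by

\[ Z(u) = \frac{T+u}{T}\left[\, Y\!\left(\frac{Tu}{T+u} \,\Big|\, T, y\right) - \frac{u}{T+u}\, y \right]. \tag{4.28} \]

Then

\[ \mathcal{E}Z(u) = 0, \tag{4.29} \]
\[ \mathcal{E}Z^2(u) = u, \tag{4.30} \]
\[ \mathcal{E}Z(v)Z(u) = v, \qquad v \le u. \tag{4.31} \]

Thus $Z(u)$ is the Wiener process. The event $Y(t \mid T, y) \ge \gamma_1 + \delta_1 t$ is equivalent
to

\[ Z(u) \ge \gamma_1 + \left(\frac{\gamma_1}{T} + \delta_1 - \frac{y}{T}\right) u. \]

We note that

\[ \frac{\gamma_1}{T} + \delta_1 - \frac{y}{T} > 0 \]

if $\gamma_1 + \delta_1 T > y$, that is, if $y$ is below the intersection of the upper line and
$t = T$. Now Theorem 4.2 follows from Theorem 4.1.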
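The time change used in this proof can be checked with exact rational arithmetic. The following is only a sketch, assuming the standard covariance $s(T - t)/T$ ($s \le t$) of the conditioned process and the transformation $t = Tu/(T+u)$ with scale factor $(T+u)/T$; the function names are hypothetical and are not from the paper.

```python
from fractions import Fraction


def bridge_cov(s, t, T):
    """Covariance of Y(s | T, y) and Y(t | T, y): min(s, t) * (T - max(s, t)) / T."""
    lo, hi = min(s, t), max(s, t)
    return lo * (T - hi) / T


def z_cov(v, u, T):
    """Covariance of Z(v) and Z(u), where Z(u) rescales the conditioned process
    at time T*u/(T+u) by the factor (T+u)/T (assumed form of the time change)."""
    tv = T * v / (T + v)
    tu = T * u / (T + u)
    return (T + v) / T * (T + u) / T * bridge_cov(tv, tu, T)


def transformed_line(gamma, delta, T, y):
    """Intercept and slope of the line that y = gamma + delta*t becomes for Z(u)."""
    return gamma, (gamma + delta * T - y) / T


if __name__ == "__main__":
    T = Fraction(5)
    # Wiener covariance min(v, u) is recovered exactly.
    print(z_cov(Fraction(1, 2), Fraction(3), T))
    print(transformed_line(Fraction(1), Fraction(1, 2), Fraction(4), Fraction(2)))
```

Because the inputs are `Fraction` values, the identity $\mathcal{E}Z(v)Z(u) = \min(v, u)$ is verified without rounding error for the chosen points.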

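4.4. The probability of going over one line first in a fixed time. Here we find
$P_1(T)$, which is obtained from the following theorem.

THEOREM 4.3. If $Y(t)$ is a Wiener process with $\mathcal{E}Y(t) = 0$ and $\mathcal{E}Y^2(t) = t$
and if $T$, $\gamma_1$, $\gamma_2$, $\delta_1$, $\delta_2$ are numbers such that $\gamma_1 > 0$, $\gamma_2 < 0$,
$\gamma_1 + \delta_1 T \ge \gamma_2 + \delta_2 T$, $T > 0$, the probability that $Y(t) \ge \gamma_1 + \delta_1 t$ for a
$t \le T$ which is smaller than any $t$ for which $Y(t) \le \gamma_2 + \delta_2 t$ is

\[
\begin{aligned}
P_1(T) ={}& 1 - \Phi\!\left(\frac{\delta_1 T + \gamma_1}{\sqrt{T}}\right) \\
 &+ \sum_{r=1}^{\infty} \Big\{ e^{-2[r\gamma_1 - (r-1)\gamma_2][r\delta_1 - (r-1)\delta_2]}\,
    \Phi\!\left(\frac{\delta_1 T + 2(r-1)\gamma_2 - (2r-1)\gamma_1}{\sqrt{T}}\right) \\
 &\quad - e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_1\delta_2 - r(r+1)\gamma_2\delta_1]}\,
    \Phi\!\left(\frac{\delta_1 T + 2r\gamma_2 - (2r-1)\gamma_1}{\sqrt{T}}\right) \\
 &\quad - e^{-2[(r-1)\gamma_1 - r\gamma_2][(r-1)\delta_1 - r\delta_2]}
    \left[1 - \Phi\!\left(\frac{\delta_1 T + (2r-1)\gamma_1 - 2r\gamma_2}{\sqrt{T}}\right)\right] \\
 &\quad + e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_2\delta_1 - r(r+1)\gamma_1\delta_2]}
    \left[1 - \Phi\!\left(\frac{\delta_1 T + (2r+1)\gamma_1 - 2r\gamma_2}{\sqrt{T}}\right)\right] \Big\},
\end{aligned}
\tag{4.32}
\]

where

\[ \Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du. \tag{4.33} \]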
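A direct numerical evaluation of the series for $P_1(T)$ can serve as a check on the formula. The sketch below assumes the four-term form of (4.32) (exponential weights times $\Phi$ or $1 - \Phi$), truncates the sum at a fixed number of terms, and uses hypothetical names (`p1_upper_first`, `terms`). It is a naive evaluation: for parameter values where an exponent is large and positive, `math.exp` can overflow, and the Mill's Ratio form given later is the one the paper reports using for computation.

```python
import math


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def p1_upper_first(T, g1, g2, d1, d2, terms=50):
    """Truncated series for the probability that the line g1 + d1*t is touched
    before the line g2 + d2*t, within time T (assumes g1 > 0 > g2 and
    g1 + d1*T >= g2 + d2*T)."""
    rt = math.sqrt(T)
    total = 1.0 - Phi((d1 * T + g1) / rt)
    for r in range(1, terms + 1):
        e1 = math.exp(-2.0 * (r * g1 - (r - 1) * g2) * (r * d1 - (r - 1) * d2))
        e2 = math.exp(-2.0 * (r * r * (g1 * d1 + g2 * d2)
                              - r * (r - 1) * g1 * d2 - r * (r + 1) * g2 * d1))
        e3 = math.exp(-2.0 * ((r - 1) * g1 - r * g2) * ((r - 1) * d1 - r * d2))
        e4 = math.exp(-2.0 * (r * r * (g1 * d1 + g2 * d2)
                              - r * (r - 1) * g2 * d1 - r * (r + 1) * g1 * d2))
        total += e1 * Phi((d1 * T + 2 * (r - 1) * g2 - (2 * r - 1) * g1) / rt)
        total -= e2 * Phi((d1 * T + 2 * r * g2 - (2 * r - 1) * g1) / rt)
        total -= e3 * (1.0 - Phi((d1 * T + (2 * r - 1) * g1 - 2 * r * g2) / rt))
        total += e4 * (1.0 - Phi((d1 * T + (2 * r + 1) * g1 - 2 * r * g2) / rt))
    return total


if __name__ == "__main__":
    # Horizontal lines at +1 and -1, no drift, T = 1.
    print(p1_upper_first(1.0, 1.0, -1.0, 0.0, 0.0))
```

Two independent checks are available: for horizontal lines at $\pm 1$ the result must equal half the exit probability of Brownian motion from $(-1, 1)$, and when the lower line is moved far away the result must approach the single-line crossing probability $1 - \Phi((\delta_1 T + \gamma_1)/\sqrt{T}) + e^{-2\gamma_1\delta_1}\Phi((\delta_1 T - \gamma_1)/\sqrt{T})$.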


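PROOF. The density of $Y(t)$ at $t = T$ is $n(y \mid 0, T)$, the normal density with
mean 0 and variance $T$. Then the probability of a path touching the upper line
before the lower line for $t \le T$ is

\[ \int_{-\infty}^{\infty} P_1(T, y)\, n(y \mid 0, T)\, dy. \tag{4.34} \]

The integration to $\gamma_1 + \delta_1 T$ is

\[ \int_{-\infty}^{\gamma_1 + \delta_1 T} \sum_{r=1}^{\infty} \left[ e^{-g_{1r}(y)} - e^{-g_{2r}(y)} \right] n(y \mid 0, T)\, dy, \tag{4.35} \]

where

\[
\begin{aligned}
g_{1r}(y) &= \frac{2}{T}\left[ y\{-r^2\gamma_1 - (r-1)^2\gamma_2 + r(r-1)\gamma_1 + r(r-1)\gamma_2\} + k_{1r} \right] \\
          &= \frac{2}{T}\left[ y\{-r\gamma_1 + (r-1)\gamma_2\} + k_{1r} \right],
\end{aligned}
\tag{4.36}
\]

\[
\begin{aligned}
k_{1r} ={}& r^2\gamma_1(\gamma_1 + \delta_1 T) + (r-1)^2\gamma_2(\gamma_2 + \delta_2 T) \\
 & - r(r-1)\gamma_1(\gamma_2 + \delta_2 T) - r(r-1)\gamma_2(\gamma_1 + \delta_1 T),
\end{aligned}
\tag{4.37}
\]

\[
\begin{aligned}
g_{2r}(y) &= \frac{2}{T}\left[ y\{-r^2\gamma_1 - r^2\gamma_2 + r(r-1)\gamma_1 + r(r+1)\gamma_2\} + k_{2r} \right] \\
          &= \frac{2}{T}\left[ (-r\gamma_1 + r\gamma_2)\,y + k_{2r} \right],
\end{aligned}
\tag{4.38}
\]

\[
\begin{aligned}
k_{2r} ={}& r^2\gamma_1(\gamma_1 + \delta_1 T) + r^2\gamma_2(\gamma_2 + \delta_2 T) \\
 & - r(r-1)\gamma_1(\gamma_2 + \delta_2 T) - r(r+1)\gamma_2(\gamma_1 + \delta_1 T).
\end{aligned}
\tag{4.39}
\]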


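Since

\[ e^{-g_{1r}(y)} \le e^{-(2/T)(r-1)[r\gamma_1 - (r-1)\gamma_2][\gamma_1 + \delta_1 T - (\gamma_2 + \delta_2 T)]}, \tag{4.40} \]

\[ e^{-g_{2r}(y)} \le e^{-(2/T)\,r\,[(r-1)\gamma_1 - r\gamma_2][\gamma_1 + \delta_1 T - (\gamma_2 + \delta_2 T)]} \tag{4.41} \]

for $y \le \gamma_1 + \delta_1 T$, the series in $P_1(T, y)$ is bounded in absolute value by a
series that converges when $\gamma_1 + \delta_1 T > \gamma_2 + \delta_2 T$, and hence the order of
summation and integration in (4.35) can be reversed. Then

\[
\begin{aligned}
\int_{-\infty}^{\gamma_1+\delta_1 T} e^{-g_{1r}(y)}\, n(y \mid 0, T)\, dy
 &= \int_{-\infty}^{\gamma_1+\delta_1 T} \frac{1}{\sqrt{2\pi T}}\,
    e^{-\{1/(2T)\}\left(y^2 + 4y[(r-1)\gamma_2 - r\gamma_1] + 4k_{1r}\right)}\, dy \\
 &= e^{(2/T)\{[(r-1)\gamma_2 - r\gamma_1]^2 - k_{1r}\}}
    \int_{-\infty}^{\gamma_1+\delta_1 T} \frac{1}{\sqrt{2\pi T}}\,
    e^{-\{1/(2T)\}\left(y + 2[(r-1)\gamma_2 - r\gamma_1]\right)^2}\, dy \\
 &= e^{-2[r\gamma_1 - (r-1)\gamma_2][r\delta_1 - (r-1)\delta_2]}\,
    \Phi\!\left(\frac{\delta_1 T + 2(r-1)\gamma_2 - (2r-1)\gamma_1}{\sqrt{T}}\right).
\end{aligned}
\tag{4.42}
\]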





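Similarly

\[
\begin{aligned}
\int_{-\infty}^{\gamma_1+\delta_1 T} e^{-g_{2r}(y)}\, n(y \mid 0, T)\, dy
 &= \int_{-\infty}^{\gamma_1+\delta_1 T} \frac{1}{\sqrt{2\pi T}}\,
    e^{-\{1/(2T)\}\left(y^2 + 4yr(\gamma_2 - \gamma_1) + 4k_{2r}\right)}\, dy \\
 &= e^{(2/T)\{r^2(\gamma_2 - \gamma_1)^2 - k_{2r}\}}
    \int_{-\infty}^{\gamma_1+\delta_1 T} \frac{1}{\sqrt{2\pi T}}\,
    e^{-\{1/(2T)\}\left(y + 2r(\gamma_2 - \gamma_1)\right)^2}\, dy \\
 &= e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_1\delta_2 - r(r+1)\gamma_2\delta_1]}\,
    \Phi\!\left(\frac{\delta_1 T + 2r\gamma_2 - (2r-1)\gamma_1}{\sqrt{T}}\right).
\end{aligned}
\tag{4.43}
\]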


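The integration from $\gamma_1 + \delta_1 T$ to $\infty$ is

\[
\int_{\gamma_1+\delta_1 T}^{\infty} \frac{1}{\sqrt{2\pi T}}\, e^{-y^2/(2T)}\, dy
 - \int_{\gamma_1+\delta_1 T}^{\infty} \sum_{r=1}^{\infty}
   \left\{ e^{-g_{3r}(y)} - e^{-g_{4r}(y)} \right\} n(y \mid 0, T)\, dy,
\tag{4.44}
\]

where

\[
\begin{aligned}
g_{3r}(y) &= \frac{2}{T}\left[ y\{-r^2\gamma_2 - (r-1)^2\gamma_1 + r(r-1)\gamma_1 + r(r-1)\gamma_2\} + k_{3r} \right] \\
          &= \frac{2}{T}\left[ y\{(r-1)\gamma_1 - r\gamma_2\} + k_{3r} \right],
\end{aligned}
\tag{4.45}
\]

\[
\begin{aligned}
k_{3r} ={}& r^2\gamma_2(\gamma_2 + \delta_2 T) + (r-1)^2\gamma_1(\gamma_1 + \delta_1 T) \\
 & - r(r-1)\gamma_1(\gamma_2 + \delta_2 T) - r(r-1)\gamma_2(\gamma_1 + \delta_1 T),
\end{aligned}
\tag{4.46}
\]

\[
\begin{aligned}
g_{4r}(y) &= \frac{2}{T}\left[ y\{-r^2\gamma_1 - r^2\gamma_2 + r(r-1)\gamma_2 + r(r+1)\gamma_1\} + k_{4r} \right] \\
          &= \frac{2}{T}\left[ y\{r\gamma_1 - r\gamma_2\} + k_{4r} \right],
\end{aligned}
\tag{4.47}
\]

\[
\begin{aligned}
k_{4r} ={}& r^2\gamma_1(\gamma_1 + \delta_1 T) + r^2\gamma_2(\gamma_2 + \delta_2 T) \\
 & - r(r-1)\gamma_2(\gamma_1 + \delta_1 T) - r(r+1)\gamma_1(\gamma_2 + \delta_2 T).
\end{aligned}
\tag{4.48}
\]

In this case, for $y \ge \gamma_1 + \delta_1 T$,

\[ e^{-g_{3r}(y)} \le e^{-(2/T)\,r\,[(r-1)\gamma_1 - r\gamma_2][\gamma_1 + \delta_1 T - (\gamma_2 + \delta_2 T)]}, \tag{4.49} \]

\[ e^{-g_{4r}(y)} \le e^{-(2/T)\,r\,[(r+1)\gamma_1 - r\gamma_2][\gamma_1 + \delta_1 T - (\gamma_2 + \delta_2 T)]}. \tag{4.50} \]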





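\[
\begin{aligned}
\int_{\gamma_1+\delta_1 T}^{\infty} e^{-g_{3r}(y)}\, n(y \mid 0, T)\, dy
 &= \int_{\gamma_1+\delta_1 T}^{\infty} \frac{1}{\sqrt{2\pi T}}\,
    e^{-\{1/(2T)\}\left(y^2 + 4y[(r-1)\gamma_1 - r\gamma_2] + 4k_{3r}\right)}\, dy \\
 &= e^{(2/T)\{[(r-1)\gamma_1 - r\gamma_2]^2 - k_{3r}\}}
    \int_{\gamma_1+\delta_1 T}^{\infty} \frac{1}{\sqrt{2\pi T}}\,
    e^{-\{1/(2T)\}\left(y + 2[(r-1)\gamma_1 - r\gamma_2]\right)^2}\, dy \\
 &= e^{-2[(r-1)\gamma_1 - r\gamma_2][(r-1)\delta_1 - r\delta_2]}
    \left[ 1 - \Phi\!\left(\frac{\delta_1 T + (2r-1)\gamma_1 - 2r\gamma_2}{\sqrt{T}}\right) \right],
\end{aligned}
\tag{4.51}
\]

\[
\begin{aligned}
\int_{\gamma_1+\delta_1 T}^{\infty} e^{-g_{4r}(y)}\, n(y \mid 0, T)\, dy
 &= \int_{\gamma_1+\delta_1 T}^{\infty} \frac{1}{\sqrt{2\pi T}}\,
    e^{-\{1/(2T)\}\left(y^2 + 4y\{r\gamma_1 - r\gamma_2\} + 4k_{4r}\right)}\, dy \\
 &= e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_2\delta_1 - r(r+1)\gamma_1\delta_2]}
    \left[ 1 - \Phi\!\left(\frac{\delta_1 T + (2r+1)\gamma_1 - 2r\gamma_2}{\sqrt{T}}\right) \right].
\end{aligned}
\tag{4.52}
\]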


Then $P_1(T)$ follows for $\gamma_1 + \delta_1 T > \gamma_2 + \delta_2 T$. In case $\gamma_1 + \delta_1 T = \gamma_2 + \delta_2 T$
we argue that $P_1(T) = \lim P_1(t)$, $t \to T$ and $t \le T$, since $P_1(t) \le P_1(T)$, $t \le T$,
and $P_1(T) - P_1(t)$ is less than the probability that $\gamma_2 + \delta_2 t < Y(t) < \gamma_1 + \delta_1 t$,
which converges to 0. It will be seen in the discussion following Corollary 4.1
that when $\delta_1 T + \gamma_1 = \delta_2 T + \gamma_2$ the series (4.54) converges as $1/(ar^2 + br + c)$.
Hence, for $t \le T$, the series (4.54) for $P_1(t)$ can be majorized (uniformly in $t$)
by a series $k/r^2$, which converges. This proves the theorem.

The probability of touching the lower line first and before $t = T$, say $P_2(T)$,
is obtained from Theorem 4.3 by interchanging $(\gamma_1, \delta_1)$ with $(-\gamma_2, -\delta_2)$.

The probability (4.32) can be written in different ways. One which is
convenient for computing is to use Mill's Ratio

\[ R(x) = \frac{1 - \Phi(x)}{\varphi(x)}, \tag{4.53} \]

where $\varphi(x) = n(x \mid 0, 1)$.





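COROLLARY 4.1.

\[
\begin{aligned}
P_1(T) ={}& 1 - \Phi\!\left(\frac{\delta_1 T + \gamma_1}{\sqrt{T}}\right)
 + \varphi\!\left(\frac{\delta_1 T + \gamma_1}{\sqrt{T}}\right) \sum_{r=1}^{\infty} \Big\{ \\
 &\quad e^{-\{2(r-1)/T\}[r\gamma_1 - (r-1)\gamma_2][\delta_1 T + \gamma_1 - (\delta_2 T + \gamma_2)]}\,
   R\!\left(\frac{2(r\gamma_1 - (r-1)\gamma_2) - (\delta_1 T + \gamma_1)}{\sqrt{T}}\right) \\
 &\quad - e^{-(2r/T)[(r-1)\gamma_1 - r\gamma_2][\delta_1 T + \gamma_1 - (\delta_2 T + \gamma_2)]}\,
   R\!\left(\frac{2r(\gamma_1 - \gamma_2) - (\delta_1 T + \gamma_1)}{\sqrt{T}}\right) \\
 &\quad - e^{-(2r/T)[(r-1)\gamma_1 - r\gamma_2][\delta_1 T + \gamma_1 - (\delta_2 T + \gamma_2)]}\,
   R\!\left(\frac{2((r-1)\gamma_1 - r\gamma_2) + (\delta_1 T + \gamma_1)}{\sqrt{T}}\right) \\
 &\quad + e^{-(2r/T)[(r+1)\gamma_1 - r\gamma_2][\delta_1 T + \gamma_1 - (\delta_2 T + \gamma_2)]}\,
   R\!\left(\frac{2r(\gamma_1 - \gamma_2) + (\delta_1 T + \gamma_1)}{\sqrt{T}}\right) \Big\} \\[4pt]
 ={}& \varphi\!\left(\frac{\delta_1 T + \gamma_1}{\sqrt{T}}\right) \sum_{r=0}^{\infty} \Big\{
   e^{-(2r/T)[(r+1)\gamma_1 - r\gamma_2][\delta_1 T + \gamma_1 - (\delta_2 T + \gamma_2)]} \\
 &\qquad \times \left[ R\!\left(\frac{2((r+1)\gamma_1 - r\gamma_2) - (\delta_1 T + \gamma_1)}{\sqrt{T}}\right)
   + R\!\left(\frac{2r(\gamma_1 - \gamma_2) + (\delta_1 T + \gamma_1)}{\sqrt{T}}\right) \right] \\
 &\quad - e^{-\{2(r+1)/T\}[r\gamma_1 - (r+1)\gamma_2][\delta_1 T + \gamma_1 - (\delta_2 T + \gamma_2)]} \\
 &\qquad \times \left[ R\!\left(\frac{2(r+1)(\gamma_1 - \gamma_2) - (\delta_1 T + \gamma_1)}{\sqrt{T}}\right)
   + R\!\left(\frac{2(r\gamma_1 - (r+1)\gamma_2) + (\delta_1 T + \gamma_1)}{\sqrt{T}}\right) \right] \Big\}.
\end{aligned}
\tag{4.54}
\]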


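Mill's Ratio $R(x)$ for $x > 0$ satisfies the inequalities $R(x) < 1/x$ and

\[
\frac{1}{x} - \frac{1}{x^3} + \cdots - \frac{1 \cdot 3 \cdots (4k-3)}{x^{4k-1}}
 < R(x) <
\frac{1}{x} - \frac{1}{x^3} + \cdots + \frac{1 \cdot 3 \cdots (4k-1)}{x^{4k+1}},
\qquad k = 1, 2, \cdots.
\]

Thus for large $x$, $R(x)$ behaves like $1/x$ (with an error of less than $1/x^3$). If
$\delta_1 T + \gamma_1 = \delta_2 T + \gamma_2$, then for $r$ large, the $r$th term of (4.54) is approximately

\[
\begin{aligned}
& \sqrt{T}\left\{ \frac{1}{(2r+1)\gamma_1 - 2r\gamma_2 - \delta_1 T} + \frac{1}{(2r+1)\gamma_1 - 2r\gamma_2 + \delta_1 T} \right. \\
& \qquad \left. - \frac{1}{(2r+1)\gamma_1 - 2(r+1)\gamma_2 - \delta_1 T} - \frac{1}{(2r+1)\gamma_1 - 2(r+1)\gamma_2 + \delta_1 T} \right\} \\
& = 2\sqrt{T}\left\{ \frac{(2r+1)\gamma_1 - 2r\gamma_2}{[(2r+1)\gamma_1 - 2r\gamma_2]^2 - \delta_1^2 T^2}
   - \frac{(2r+1)\gamma_1 - 2(r+1)\gamma_2}{[(2r+1)\gamma_1 - 2(r+1)\gamma_2]^2 - \delta_1^2 T^2} \right\},
\end{aligned}
\tag{4.56}
\]

which is of the order $1/r^2$.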
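The bounds on Mill's Ratio quoted above are easy to check numerically. The following sketch evaluates $R(x)$ through the complementary error function and compares it with the partial sums of the alternating series $1/x - 1/x^3 + 1\cdot 3/x^5 - \cdots$; the function names are hypothetical, and the direct formula overflows for $x$ larger than roughly 37 because of the factor $e^{x^2/2}$.

```python
import math


def mills_ratio(x):
    """R(x) = (1 - Phi(x)) / phi(x), written via erfc to avoid forming 1 - Phi."""
    return math.sqrt(math.pi / 2.0) * math.exp(0.5 * x * x) * math.erfc(x / math.sqrt(2.0))


def mills_bounds(x, k):
    """Partial sums of 1/x - 1/x^3 + 3/x^5 - ... for x > 0, k >= 1.

    Returns (lower, upper): the sum ending with the negative term in x^(4k-1)
    and the sum ending with the positive term in x^(4k+1)."""
    term = 1.0 / x
    partial = term
    lower = upper = None
    for j in range(1, 2 * k + 1):
        term *= -(2 * j - 1) / (x * x)  # next term: (-1)^j (2j-1)!! / x^(2j+1)
        partial += term
        if j == 2 * k - 1:
            lower = partial
        if j == 2 * k:
            upper = partial
    return lower, upper


if __name__ == "__main__":
    for x in (2.0, 3.0, 5.0):
        lo, hi = mills_bounds(x, 1)
        print(x, lo, mills_ratio(x), hi)
```

For small $x$ the bounds are loose (the series is asymptotic, not convergent), which is consistent with the remark that $R(x)$ behaves like $1/x$ only for large $x$.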

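To express the probability of $X(t) = Y(t) + \mu t$ touching $x = c_1 + d_1 t$ before
$x = c_2 + d_2 t$ we replace $\gamma_1$ by $c_1$, $\gamma_2$ by $c_2$, $\delta_1$ by $d_1 - \mu$, and $\delta_2$ by $d_2 - \mu$ in the
preceding formulas. If the sequential procedure is symmetric, $c_1 = -c_2 = c$,
say, and $d_1 = -d_2 = d$, say, and the formulas are simplified.

COROLLARY 4.2. If $\gamma_1 = -\gamma_2 = c$, $\delta_1 = d - \mu$, and $\delta_2 = -d - \mu$, then the
probability of touching the upper line first before $t = T$ is

\[
\begin{aligned}
P_1(T) ={}& 1 - \Phi\!\left(\frac{(d-\mu)T + c}{\sqrt{T}}\right)
 + \sum_{r=1}^{\infty} \Big\{ e^{-2(2r-1)c[(2r-1)d - \mu]}\,
   \Phi\!\left(\frac{(d-\mu)T - (4r-3)c}{\sqrt{T}}\right) \\
 &\quad - e^{-2\cdot 2rc[2rd - \mu]}\, \Phi\!\left(\frac{(d-\mu)T - (4r-1)c}{\sqrt{T}}\right) \\
 &\quad - e^{-2(2r-1)c[(2r-1)d + \mu]} \left[ 1 - \Phi\!\left(\frac{(d-\mu)T + (4r-1)c}{\sqrt{T}}\right) \right] \\
 &\quad + e^{-2\cdot 2rc[2rd + \mu]} \left[ 1 - \Phi\!\left(\frac{(d-\mu)T + (4r+1)c}{\sqrt{T}}\right) \right] \Big\} \\[4pt]
 ={}& 1 - \Phi\!\left(\frac{(d-\mu)T + c}{\sqrt{T}}\right)
 + \sum_{s=1}^{\infty} (-1)^{s+1} \Big\{ e^{-2sc[sd - \mu]}\,
   \Phi\!\left(\frac{(d-\mu)T - (2s-1)c}{\sqrt{T}}\right) \\
 &\quad - e^{-2sc[sd + \mu]} \left[ 1 - \Phi\!\left(\frac{(d-\mu)T + (2s+1)c}{\sqrt{T}}\right) \right] \Big\}.
\end{aligned}
\]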


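The probability of touching the lower line first before $t = T$ is

\[
\begin{aligned}
P_2(T) ={}& 1 - \Phi\!\left(\frac{(d+\mu)T + c}{\sqrt{T}}\right)
 + \sum_{s=1}^{\infty} (-1)^{s+1} \Big\{ e^{-2sc[sd + \mu]}\,
   \Phi\!\left(\frac{(d+\mu)T - (2s-1)c}{\sqrt{T}}\right) \\
 &\quad - e^{-2sc[sd - \mu]} \left[ 1 - \Phi\!\left(\frac{(d+\mu)T + (2s+1)c}{\sqrt{T}}\right) \right] \Big\}.
\end{aligned}
\]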







The probability of touching one or both lines before t = T is 


T (T) =1- ortete) (Porte) 
P\(T) + PT) =1 [os ® ner 
= —1)"" | inal | o(eP teeter +e) 
+2 ( ) ‘e ae 
-o(et+ 2sc — arty) 
VT 


(4.59) 9 e™etettn) | (=eE Seat x 8) 


~ o{ of +3e — T+ o\ 
o( VT )| 


f 
— oo ml atl —2e2cd+2ecu E (S + d)T + (2s + Ue) 
1+ 2 (-1)"e a 
aa (G— OT t (ae = Ney] 
VT ; 


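These results can also be expressed in terms of Mill's Ratio.

COROLLARY 4.3. If $\gamma_1 = -\gamma_2 = c$, $\delta_1 = d - \mu$ and $\delta_2 = -d - \mu$, then the
probability of touching the upper line first before $t = T$ is

\[
\begin{aligned}
P_1(T) ={}& \Phi\!\left(\frac{\mu T - (dT + c)}{\sqrt{T}}\right)
 + \varphi\!\left(\frac{\mu T - (dT + c)}{\sqrt{T}}\right) \Big\{
   \sum_{s=1}^{\infty} (-1)^{s+1} e^{-2(c/T)(dT+c)s(s-1)}\,
   R\!\left(\frac{2sc + \mu T - (dT + c)}{\sqrt{T}}\right) \\
 &\quad - \sum_{s=1}^{\infty} (-1)^{s+1} e^{-2(c/T)(dT+c)s(s+1)}\,
   R\!\left(\frac{2sc - \mu T + (dT + c)}{\sqrt{T}}\right) \Big\} \\[4pt]
 ={}& \varphi\!\left(\frac{\mu T - (dT + c)}{\sqrt{T}}\right) \sum_{s=0}^{\infty} (-1)^{s} e^{-2(c/T)(dT+c)s(s+1)} \\
 &\quad \times \left[ R\!\left(\frac{(2s+1)c - (d-\mu)T}{\sqrt{T}}\right)
   + R\!\left(\frac{(2s+1)c + (d-\mu)T}{\sqrt{T}}\right) \right].
\end{aligned}
\tag{4.60}
\]

The probability of touching the lower line first before $t = T$ is

\[
\begin{aligned}
P_2(T) ={}& \Phi\!\left(-\frac{\mu T + dT + c}{\sqrt{T}}\right)
 + \varphi\!\left(\frac{\mu T + dT + c}{\sqrt{T}}\right) \Big\{
   \sum_{s=1}^{\infty} (-1)^{s+1} e^{-2(c/T)(dT+c)s(s-1)}\,
   R\!\left(\frac{2sc - \mu T - (dT + c)}{\sqrt{T}}\right) \\
 &\quad - \sum_{s=1}^{\infty} (-1)^{s+1} e^{-2(c/T)(dT+c)s(s+1)}\,
   R\!\left(\frac{2sc + \mu T + (dT + c)}{\sqrt{T}}\right) \Big\} \\[4pt]
 ={}& \varphi\!\left(\frac{\mu T + dT + c}{\sqrt{T}}\right) \sum_{s=0}^{\infty} (-1)^{s} e^{-2(c/T)(dT+c)s(s+1)} \\
 &\quad \times \left[ R\!\left(\frac{(2s+1)c - (d+\mu)T}{\sqrt{T}}\right)
   + R\!\left(\frac{(2s+1)c + (d+\mu)T}{\sqrt{T}}\right) \right].
\end{aligned}
\tag{4.61}
\]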


The expressions in Corollary 4.3 were used for computation. If $c + dT > 0$
($\gamma_1 + \delta_1 T > \gamma_2 + \delta_2 T$), the exponential term in the sum

\[ \exp[-2(c/T)(c + dT)s(s+1)] \]

decreases rapidly and the series converges rapidly. It is an alternating series,
and the last term used bounds the error. For large $x$, $R(x)$ behaves like $1/x$. If
$dT + c = 0$, the convergence of the series for $P_1(T)$ and $P_2(T)$ is like

\[ \sum (-1)^s / [(2s+1)c + (d + \mu)T]. \]

When $\gamma_1 + \delta_1 T = \gamma_2 + \delta_2 T$, we can use the third part of (4.24). In
particular if $\gamma_1 = -\gamma_2 = c$, $\delta_1 = d - \mu$, $\delta_2 = -d - \mu$,

\[ P_1(T, y) = \frac{e^{2c(\mu T + y)/T}}{e^{2c(\mu T + y)/T} + 1} = \frac{1}{1 + e^{-2c(\mu T + y)/T}}. \tag{4.62} \]

COROLLARY 4.4. If $\gamma_1 = -\gamma_2 = c$, $\delta_1 = d - \mu$, $\delta_2 = -d - \mu$ and $c + dT = 0$,
then


x 2/(@T) 
e ly ] 


PAT) = eum [ ed een W 


x ~(2% /2) 


os e a 
4) Oe [. 1} eel Pete 


l [ e-l(2-#V 7) 2/2) 


” Vie Leltetiven™ 


e . 
(s eV/F)2/2 elte/V/T)2 


i 
~ Van be t+ calvin ™ 


The expressions in Corollary 4.4 are not very useful since the integration 
cannot be evaluated in closed form. We can give an approximation which may 
be useful for some purposes. 







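COROLLARY 4.5. If $\gamma_1 = -\gamma_2 = c$, $\delta_1 = d - \mu$, $\delta_2 = -d - \mu$ and $c + dT = 0$,
then $P_1(T)$ is approximately

\[ \Phi\!\left( \frac{2c\mu/\beta}{\sqrt{4c^2/(\beta^2 T) + 1}} \right), \tag{4.64} \]

where $\beta = 1.702$.

PROOF. Haley [6] has shown that

\[ \left| \Phi(x) - \frac{1}{1 + e^{-\beta x}} \right| < .01. \tag{4.65} \]

From Corollary 4.4 we have that $P_1(T)$ is approximately

\[
\begin{aligned}
& \int_{-\infty}^{\infty} \Phi\!\left( \frac{2c}{\beta T}(y + \mu T) \right) \frac{1}{\sqrt{2\pi T}}\, e^{-y^2/(2T)}\, dy \\
& \quad = \int_{-\infty}^{\infty} \int_{-\infty}^{[2c/(\beta T)](y + \mu T)} \frac{1}{2\pi\sqrt{T}}\, e^{-u^2/2 - y^2/(2T)}\, du\, dy \\
& \quad = \Pr\left\{ U \le \frac{2c}{\beta T}\, Y + \frac{2c\mu}{\beta} \right\}.
\end{aligned}
\]

But $U$ and $Y$ are independently normally distributed and $U - [2c/(\beta T)]Y = Z$
is normally distributed with mean zero and variance $1 + [4c^2/(\beta^2 T)]$. The above
is

\[ \Pr\left\{ Z^* \le \frac{2c\mu/\beta}{\sqrt{4c^2/(\beta^2 T) + 1}} \right\}, \]

where $Z^*$ is normally distributed with mean 0 and variance 1.

As an example of the accuracy of the approximation, let $\mu = .16449$, $T = 169$,
and $c = 13.2$. Then $P_1(T) = .9453$. The argument in $\Phi$ here is 1.639 and the
probability is .9495. The error is .004.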
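The numerical example can be reproduced in a few lines. This is a sketch under the assumption that the approximation has the form $\Phi\big((2c\mu/\beta)/\sqrt{1 + 4c^2/(\beta^2 T)}\big)$ with $\beta = 1.702$, which is what the argument 1.639 quoted above implies; the function names are hypothetical. The second function checks the logistic bound attributed to Haley on a grid.

```python
import math

BETA = 1.702  # constant quoted in the corollary


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def p1_logistic_approx(c, mu, T, beta=BETA):
    """Argument and value of the normal approximation to P1(T) when c + d*T = 0."""
    arg = (2.0 * c * mu / beta) / math.sqrt(1.0 + 4.0 * c * c / (beta * beta * T))
    return arg, Phi(arg)


def max_logistic_error(beta=BETA, half_width=6.0, step=0.01):
    """Largest |Phi(z) - 1/(1 + exp(-beta z))| over a grid on [-half_width, half_width]."""
    n = int(round(2 * half_width / step))
    worst = 0.0
    for i in range(n + 1):
        z = -half_width + i * step
        worst = max(worst, abs(Phi(z) - 1.0 / (1.0 + math.exp(-beta * z))))
    return worst


if __name__ == "__main__":
    arg, prob = p1_logistic_approx(c=13.2, mu=0.16449, T=169.0)
    print(round(arg, 3), round(prob, 4))
    print(max_logistic_error())
```

Evaluating $\Phi$ at the unrounded argument gives a value slightly below the .9495 quoted in the text, which appears to correspond to a table lookup at the rounded argument.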

4.5. The probability of accepting a hypothesis. The sequential procedure is to 
accept the hypothesis that the mean is large if either the path touches the 
upper line before t = T' or it stays between the two lines tot = T and att = T 
is above some value k. We now proceed to find expressions for this probability. 

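THEOREM 4.4. If $Y(t)$ is the Wiener process with $\mathcal{E}Y(t) = 0$ and $\mathcal{E}Y^2(t) = t$
and if $\gamma_1$, $\gamma_2$, $\delta_1$, $\delta_2$, $T$, and $\theta$ are numbers such that $\gamma_1 > 0$, $\gamma_2 < 0$, $T > 0$,
$\gamma_1 + \delta_1 T \ge \theta \ge \gamma_2 + \delta_2 T$, the probability of either $Y(t) \ge \gamma_1 + \delta_1 t$ for a $t$ ($\le T$)
smaller than any $t$ for which $Y(t) \le \gamma_2 + \delta_2 t$ or $\gamma_2 + \delta_2 t < Y(t) < \gamma_1 + \delta_1 t$,
$0 \le t \le T$, and $Y(T) > \theta$ is

\[
\begin{aligned}
A(T, \theta) ={}& 1 - \Phi\!\left(\frac{\theta}{\sqrt{T}}\right)
 + \sum_{r=1}^{\infty} \Big\{ e^{-2[r\gamma_1 - (r-1)\gamma_2][r\delta_1 - (r-1)\delta_2]}\,
   \Phi\!\left(\frac{\theta - 2r\gamma_1 + 2(r-1)\gamma_2}{\sqrt{T}}\right) \\
 &\quad - e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_1\delta_2 - r(r+1)\gamma_2\delta_1]}\,
   \Phi\!\left(\frac{\theta + 2r(\gamma_2 - \gamma_1)}{\sqrt{T}}\right) \\
 &\quad - e^{-2[r\gamma_2 - (r-1)\gamma_1][r\delta_2 - (r-1)\delta_1]}\,
   \Phi\!\left(\frac{-\theta + 2r\gamma_2 - 2(r-1)\gamma_1}{\sqrt{T}}\right) \\
 &\quad + e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_2\delta_1 - r(r+1)\gamma_1\delta_2]}\,
   \Phi\!\left(\frac{-\theta + 2r(\gamma_2 - \gamma_1)}{\sqrt{T}}\right) \Big\}.
\end{aligned}
\]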

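PROOF. We have

\[
A(T, \theta) = \int_{-\infty}^{\theta} P_1(T, y)\, \frac{1}{\sqrt{2\pi T}}\, e^{-y^2/(2T)}\, dy
 + \int_{\theta}^{\infty} \left(1 - P_2(T, y)\right) \frac{1}{\sqrt{2\pi T}}\, e^{-y^2/(2T)}\, dy;
\tag{4.69}
\]

that is, if $Y(T) \le \theta$, the event could happen only by touching the upper line
first and if $Y(T) \ge \theta$ the event could only fail to happen by touching the lower
line first. These two probabilities are evaluated as for Theorem 4.3. We find

\[
\begin{aligned}
\int_{-\infty}^{\theta} P_1(T, y)\, \frac{1}{\sqrt{2\pi T}}\, e^{-y^2/(2T)}\, dy
 ={}& \sum_{r=1}^{\infty} \Big\{ e^{-2[r\gamma_1 - (r-1)\gamma_2][r\delta_1 - (r-1)\delta_2]}\,
   \Phi\!\left(\frac{\theta + 2[(r-1)\gamma_2 - r\gamma_1]}{\sqrt{T}}\right) \\
 &\quad - e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_1\delta_2 - r(r+1)\gamma_2\delta_1]}\,
   \Phi\!\left(\frac{\theta + 2r(\gamma_2 - \gamma_1)}{\sqrt{T}}\right) \Big\},
\end{aligned}
\tag{4.70}
\]

\[
\begin{aligned}
\int_{\theta}^{\infty} P_2(T, y)\, \frac{1}{\sqrt{2\pi T}}\, e^{-y^2/(2T)}\, dy
 ={}& \sum_{r=1}^{\infty} \Big\{ e^{-2[r\gamma_2 - (r-1)\gamma_1][r\delta_2 - (r-1)\delta_1]}\,
   \Phi\!\left(\frac{-\theta + 2[r\gamma_2 - (r-1)\gamma_1]}{\sqrt{T}}\right) \\
 &\quad - e^{-2[r^2(\gamma_1\delta_1 + \gamma_2\delta_2) - r(r-1)\gamma_2\delta_1 - r(r+1)\gamma_1\delta_2]}\,
   \Phi\!\left(\frac{-\theta + 2r(\gamma_2 - \gamma_1)}{\sqrt{T}}\right) \Big\}.
\end{aligned}
\tag{4.71}
\]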


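The result of Theorem 4.4 can also be expressed using Mill's Ratio.

COROLLARY 4.6. The probability of the process touching the upper line before
the lower for $t \le T$ or staying between the lines to $t = T$ with $Y(T) > \theta$ is

\[
\begin{aligned}
A(T, \theta) ={}& 1 - \Phi\!\left(\frac{\theta}{\sqrt{T}}\right)
 + \varphi\!\left(\frac{\theta}{\sqrt{T}}\right) \sum_{r=1}^{\infty} \Big\{ \\
 &\quad e^{-(2/T)(r\gamma_1 - (r-1)\gamma_2)[r(\delta_1 T + \gamma_1 - \theta) - (r-1)(\delta_2 T + \gamma_2 - \theta)]}\,
   R\!\left(\frac{2[r\gamma_1 - (r-1)\gamma_2] - \theta}{\sqrt{T}}\right) \\
 &\quad - e^{-(2/T)r\{(\gamma_1 - \gamma_2)r[\delta_1 T + \gamma_1 - (\delta_2 T + \gamma_2)] + \gamma_1(\delta_2 T + \gamma_2 - \theta) - \gamma_2(\delta_1 T + \gamma_1 - \theta)\}}\,
   R\!\left(\frac{2r(\gamma_1 - \gamma_2) - \theta}{\sqrt{T}}\right) \\
 &\quad - e^{-(2/T)((r-1)\gamma_1 - r\gamma_2)[(r-1)(\delta_1 T + \gamma_1 - \theta) - r(\delta_2 T + \gamma_2 - \theta)]}\,
   R\!\left(\frac{2[(r-1)\gamma_1 - r\gamma_2] + \theta}{\sqrt{T}}\right) \\
 &\quad + e^{-(2/T)r\{(\gamma_1 - \gamma_2)r[\delta_1 T + \gamma_1 - (\delta_2 T + \gamma_2)] - \gamma_1(\delta_2 T + \gamma_2 - \theta) + \gamma_2(\delta_1 T + \gamma_1 - \theta)\}}\,
   R\!\left(\frac{2r(\gamma_1 - \gamma_2) + \theta}{\sqrt{T}}\right) \Big\}.
\end{aligned}
\]

The sequential procedure involves $X(t) = Y(t) + \mu t$, the lines $x = c_1 + d_1 t$
and $x = c_2 + d_2 t$ and a value $k$ at $t = T$ ($c_2 + d_2 T \le k \le c_1 + d_1 T$). The
probability of accepting the hypothesis that $\mu$ is small ($\mu = -\mu^*$) is given by
the complement of the above probability when $\gamma_1 = c_1$, $\gamma_2 = c_2$, $\delta_1 = d_1 - \mu$,
$\delta_2 = d_2 - \mu$, and $\theta = k - \mu T$; then the operating characteristic is
$L(\mu) = 1 - A(T, k - \mu T)$.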

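A case of particular interest is when the sequential procedure is symmetric.
Then $k = 0$ (as well as $c_1 = -c_2 = c$, say, and $d_1 = -d_2 = d$, say).

COROLLARY 4.7. If $\gamma_1 = -\gamma_2 = c$, $\delta_1 = d - \mu$, $\delta_2 = -d - \mu$, $\theta = -\mu T$, then

\[
\begin{aligned}
A(T, -\mu T) ={}& 1 - \Phi(-\mu\sqrt{T})
 + \sum_{r=1}^{\infty} \Big\{ e^{-2(2r-1)c[(2r-1)d - \mu]}\,
   \Phi\!\left(\frac{-\mu T - 2(2r-1)c}{\sqrt{T}}\right) \\
 &\quad - e^{-2\cdot 2rc[2rd - \mu]}\, \Phi\!\left(\frac{-\mu T - 2\cdot 2rc}{\sqrt{T}}\right)
   - e^{-2(2r-1)c[(2r-1)d + \mu]}\, \Phi\!\left(\frac{\mu T - 2(2r-1)c}{\sqrt{T}}\right) \\
 &\quad + e^{-2\cdot 2rc[2rd + \mu]}\, \Phi\!\left(\frac{\mu T - 2\cdot 2rc}{\sqrt{T}}\right) \Big\} \\[4pt]
 ={}& \Phi(\mu\sqrt{T}) + \sum_{s=1}^{\infty} (-1)^{s+1} \Big\{ e^{-2sc[sd - \mu]}\,
   \Phi\!\left(\frac{-\mu T - 2sc}{\sqrt{T}}\right)
   - e^{-2sc[sd + \mu]}\, \Phi\!\left(\frac{\mu T - 2sc}{\sqrt{T}}\right) \Big\}.
\end{aligned}
\]

In terms of Mill's Ratio, we obtain the following:

COROLLARY 4.8. If $\gamma_1 = -\gamma_2 = c$, $\delta_1 = d - \mu$, $\delta_2 = -d - \mu$, and $\theta = -\mu T$,

\[
A(T, -\mu T) = \Phi(\mu\sqrt{T}) + \varphi(\mu\sqrt{T}) \sum_{s=1}^{\infty} (-1)^{s+1}
 e^{-2s^2(c/T)(c + dT)} \left[ R\!\left(\frac{2sc + \mu T}{\sqrt{T}}\right) - R\!\left(\frac{2sc - \mu T}{\sqrt{T}}\right) \right].
\]

As in the formula for $P_1(T)$ the convergence is rapid when $c + dT > 0$.
If $c + dT = 0$ the convergence is of the order

\[ \frac{\sqrt{T}}{2sc + \mu T} - \frac{\sqrt{T}}{2sc - \mu T} = -\frac{2\mu T^{3/2}}{4s^2c^2 - \mu^2 T^2}. \tag{4.74} \]

The terms are paired differently here from Corollary 4.3.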
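For the symmetric procedure the acceptance probability can be evaluated from the $s$-indexed series. The sketch below assumes the form $A(T, -\mu T) = \Phi(\mu\sqrt{T}) + \sum_{s \ge 1} (-1)^{s+1}\{e^{-2sc(sd-\mu)}\Phi((-\mu T - 2sc)/\sqrt{T}) - e^{-2sc(sd+\mu)}\Phi((\mu T - 2sc)/\sqrt{T})\}$, truncates the sum, and uses hypothetical names. It is a naive evaluation: when $d$ is zero or very small and $\mu \ne 0$ the exponents grow with $s$ and `math.exp` can overflow, which is one reason the Mill's Ratio form is preferable for computation.

```python
import math


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def accept_upper(T, c, d, mu, terms=50):
    """Truncated series for A(T, -mu*T): probability of deciding for the upper
    boundary (or ending above 0 at T) with symmetric lines +-(c + d*t), drift mu."""
    rt = math.sqrt(T)
    total = Phi(mu * rt)
    for s in range(1, terms + 1):
        sign = 1.0 if s % 2 == 1 else -1.0
        first = math.exp(-2.0 * s * c * (s * d - mu)) * Phi((-mu * T - 2.0 * s * c) / rt)
        second = math.exp(-2.0 * s * c * (s * d + mu)) * Phi((mu * T - 2.0 * s * c) / rt)
        total += sign * (first - second)
    return total


def operating_characteristic(T, c, d, mu, terms=50):
    """L(mu) = 1 - A(T, -mu*T) for the symmetric procedure with k = 0."""
    return 1.0 - accept_upper(T, c, d, mu, terms)


if __name__ == "__main__":
    for mu in (-0.5, 0.0, 0.5):
        print(mu, accept_upper(4.0, 1.0, 0.25, mu))
```

By symmetry the values at $\mu$ and $-\mu$ must add to 1, and as $c$ grows the procedure reduces to the fixed-time test with acceptance probability $\Phi(\mu\sqrt{T})$; both properties give quick sanity checks on an implementation.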


5. Expected time to decision. The time to decision, say $\tau$, is a random variable.
If a path touches either line at a time less than $T$, observation stops and $\tau$
is this time; if neither line is touched at a time less than $T$, observation is stopped
at $T$. Thus the probability distribution of the time of observation is

\[
\Pr\{\tau \le t\} =
\begin{cases}
P_1(t) + P_2(t), & 0 \le t < T, \\
1, & T \le t.
\end{cases}
\tag{5.1}
\]

The expressions for $P_1(t)$ and $P_2(t)$ given in Section 4 are valid for
$\gamma_1 + \delta_1 t \ge \gamma_2 + \delta_2 t$. If $\gamma_1 + \delta_1 T > \gamma_2 + \delta_2 T$, there is a positive probability
that $\tau = T$, namely,

\[ \Pr\{\tau = T\} = 1 - [P_1(T) + P_2(T)] = P_0(T). \tag{5.2} \]

For $t < T$, there is a density, namely $d[P_1(t) + P_2(t)]/dt$.
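THEOREM 5.1. If $\gamma_1 + \delta_1 t > \gamma_2 + \delta_2 t$ ($t < T$), the density of the time of
observation is the sum of

\[
\begin{aligned}
\frac{dP_1(t)}{dt} ={}& \frac{1}{t^{3/2}}\, \varphi\!\left(\frac{\delta_1 t + \gamma_1}{\sqrt{t}}\right)
 \sum_{r=0}^{\infty} \Big\{ e^{-(2r/t)[(r+1)\gamma_1 - r\gamma_2][\delta_1 t + \gamma_1 - (\delta_2 t + \gamma_2)]}\,
   [(2r+1)\gamma_1 - 2r\gamma_2] \\
 &\quad - e^{-\{2(r+1)/t\}[r\gamma_1 - (r+1)\gamma_2][\delta_1 t + \gamma_1 - (\delta_2 t + \gamma_2)]}\,
   [(2r+1)\gamma_1 - 2(r+1)\gamma_2] \Big\}
\end{aligned}
\tag{5.3}
\]

and

\[
\begin{aligned}
\frac{dP_2(t)}{dt} ={}& \frac{1}{t^{3/2}}\, \varphi\!\left(\frac{\delta_2 t + \gamma_2}{\sqrt{t}}\right)
 \sum_{r=0}^{\infty} \Big\{ e^{-(2r/t)[r\gamma_1 - (r+1)\gamma_2][\delta_1 t + \gamma_1 - (\delta_2 t + \gamma_2)]}\,
   [2r\gamma_1 - (2r+1)\gamma_2] \\
 &\quad - e^{-\{2(r+1)/t\}[(r+1)\gamma_1 - r\gamma_2][\delta_1 t + \gamma_1 - (\delta_2 t + \gamma_2)]}\,
   [2(r+1)\gamma_1 - (2r+1)\gamma_2] \Big\}.
\end{aligned}
\tag{5.4}
\]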


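PROOF. It is convenient to write $P_1(T)$ as ($r$ replacing $r + 1$ in these terms)

\[
\begin{aligned}
P_1(T) = \sum_{r=0}^{\infty} \Big\{ & e^{-2[(r+1)\gamma_1 - r\gamma_2][(r+1)\delta_1 - r\delta_2]}\,
   \Phi\!\left(\frac{\delta_1 T + 2r\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right) \\
 & + e^{-2[r^2\gamma_1\delta_1 + r^2\gamma_2\delta_2 - r(r+1)\gamma_1\delta_2 - r(r-1)\gamma_2\delta_1]}\,
   \Phi\!\left(\frac{-\delta_1 T + 2r\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right) \\
 & - e^{-2[(r+1)^2\gamma_1\delta_1 + (r+1)^2\gamma_2\delta_2 - r(r+1)\gamma_1\delta_2 - (r+1)(r+2)\gamma_2\delta_1]}\,
   \Phi\!\left(\frac{\delta_1 T + 2(r+1)\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right) \\
 & - e^{-2[r\gamma_1 - (r+1)\gamma_2][r\delta_1 - (r+1)\delta_2]}\,
   \Phi\!\left(\frac{-\delta_1 T + 2(r+1)\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right) \Big\}.
\end{aligned}
\tag{5.5}
\]

Then

\[
\begin{aligned}
\frac{dP_1(T)}{dT} = \sum_{r=0}^{\infty} \Big\{ & e^{-2[(r+1)\gamma_1 - r\gamma_2][(r+1)\delta_1 - r\delta_2]}\,
   \varphi\!\left(\frac{\delta_1 T + 2r\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right)
   \frac{\delta_1 T + (2r+1)\gamma_1 - 2r\gamma_2}{2T^{3/2}} \\
 & + e^{-2[r^2\gamma_1\delta_1 + r^2\gamma_2\delta_2 - r(r+1)\gamma_1\delta_2 - r(r-1)\gamma_2\delta_1]}\,
   \varphi\!\left(\frac{-\delta_1 T + 2r\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right)
   \frac{-\delta_1 T + (2r+1)\gamma_1 - 2r\gamma_2}{2T^{3/2}} \\
 & - e^{-2[(r+1)^2\gamma_1\delta_1 + (r+1)^2\gamma_2\delta_2 - r(r+1)\gamma_1\delta_2 - (r+1)(r+2)\gamma_2\delta_1]}\,
   \varphi\!\left(\frac{\delta_1 T + 2(r+1)\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right)
   \frac{\delta_1 T + (2r+1)\gamma_1 - 2(r+1)\gamma_2}{2T^{3/2}} \\
 & - e^{-2[r\gamma_1 - (r+1)\gamma_2][r\delta_1 - (r+1)\delta_2]}\,
   \varphi\!\left(\frac{-\delta_1 T + 2(r+1)\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right)
   \frac{-\delta_1 T + (2r+1)\gamma_1 - 2(r+1)\gamma_2}{2T^{3/2}} \Big\},
\end{aligned}
\tag{5.6}
\]

which leads to (5.3). By interchanging $(\gamma_1, \delta_1)$ with $(-\gamma_2, -\delta_2)$, we obtain
(5.4).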


A characteristic of the procedure is the expected length of time of observation

\[ \mathcal{E}\tau = \int_0^T t\, \frac{dP_1(t)}{dt}\, dt + \int_0^T t\, \frac{dP_2(t)}{dt}\, dt + T P_0(T). \tag{5.7} \]

This section will be devoted to evaluating these expressions.





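THEOREM 5.2.

\[
\begin{aligned}
\mathcal{E}_1^* ={}& \int_0^T t\, \frac{dP_1(t)}{dt}\, dt \\
 ={}& \frac{1}{\delta_1} \sum_{r=0}^{\infty} \Big\{ \Big[ e^{-2[(r+1)\gamma_1 - r\gamma_2][(r+1)\delta_1 - r\delta_2]}\,
   \Phi\!\left(\frac{\delta_1 T + 2r\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right) \\
 &\qquad - e^{-2[r^2\gamma_1\delta_1 + r^2\gamma_2\delta_2 - r(r+1)\gamma_1\delta_2 - r(r-1)\gamma_2\delta_1]}\,
   \Phi\!\left(\frac{-\delta_1 T + 2r\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right) \Big]
   [(2r+1)\gamma_1 - 2r\gamma_2] \\
 &\quad - \Big[ e^{-2[(r+1)^2\gamma_1\delta_1 + (r+1)^2\gamma_2\delta_2 - r(r+1)\gamma_1\delta_2 - (r+1)(r+2)\gamma_2\delta_1]}\,
   \Phi\!\left(\frac{\delta_1 T + 2(r+1)\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right) \\
 &\qquad - e^{-2[r\gamma_1 - (r+1)\gamma_2][r\delta_1 - (r+1)\delta_2]}\,
   \Phi\!\left(\frac{-\delta_1 T + 2(r+1)\gamma_2 - (2r+1)\gamma_1}{\sqrt{T}}\right) \Big]
   [(2r+1)\gamma_1 - 2(r+1)\gamma_2] \Big\}.
\end{aligned}
\tag{5.8}
\]

When $(\gamma_1, \delta_1)$ and $(-\gamma_2, -\delta_2)$ are interchanged, (5.8) is $\mathcal{E}_2^* = \int_0^T t\,(dP_2(t)/dt)\, dt$.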

PROOF. The derivative of the right hand side of (5.8) with respect to $T$ is
identical to the derivative of the left hand side of (5.8) as obtained from Theorem
5.1. The theorem follows from the observation that the right hand side of (5.8)
is 0 for $T = 0$. (Each term is 0 because each argument of $\Phi(\ )$ goes to $-\infty$.)

In Corollary 5.1 below the series is written in another way. It will be seen 
then that when 6,7’ + 7; > &T7' + 2, the exponential terms insure rapid con- 
vergence of the series (and justify integration and differentiation term by term). 
When 6,7 + y, = &7 + v2, the convergence in (5.10) (and hence (5.8)) is 
like 1/r° and in (5.11) is like 1/r*. Thus the series can be majorized by a con- 
vergent series uniformly in 7; &i(t) &:(T), t s T, and the convergence is 
term by term. 

It might be noted that the proof of Theorem 5.2 is essentially a verification 
and depends on the way the terms are paired. The theorem could be proved 
directly from the following lemma, which in turn can be verified in a similar 
manner or can be developed by transformation of the integral: 


Lemma 5.1. 
0 (At) a = 4B = 4/1 — ‘ are) 
VT 2A? VT 
1 ss, (AT —B VT 1 _carseysiar 
+ snome(4 FG ae , mam 


_ 2AB-1 o (47 +8) 7 emi ie AT - Ar) 
2A? /T 2A? VT 


_wvT 1 e 4 T+B)2/(@T) 


A V% ; 


We can rewrite Theorem 5.2 in terms of Mill’s Ratio. 


B<90. 







Coro.uuary 5.1. 


oi. : dP, (t) 1 (A742) § 
ae [i d= >¢ SF 2. 


( 
~(2r/T)((r4+1) 94-179) (61 T4+71— (62 T+72! ‘ 
“4 ((2r + 1)" — 2rva] 


R (= + In — 2ry2 — 7 p (2+ Dn — 2mm + & 
(5.10) VT e VT 


— eer /Tilen- Ot es) Pi to Ost +rs)h 1 (Oy + 1)y, — 2(r + 1) ya) 


[a (@ + 1)m — 2(r + 1)y — 7) 
VT 


= (= + lyn — 2(r + 1l)ye + = 
ae oe 


When (7 , 6) and (—y2, —&) are interchanged, (5.10) is & = fo tidP2(t) /dt) dt. 
Coro.uary 5.2. 


ee VT 6, T > m 
ef = 7p) + VP 4 (MT 


oo 
r 7 pens —ryq) (65 T+71— (62T+72)) 
r=O 


oe + In — 2ryr — 4 T R (= + 1)y1 — 2ry2 — & *) 
VT VT 


(5.11) — @r+ Un — 2rn + &T [=> ly — 2rv2 + 3) 
VT VT 


pee e (2(r+1) /T) (r-vy— (+1) va) (6, F471 — (82 T+72)) 


= + In — 207 + Uma? (e+ In — 207 + un= 4?) 
VT VT 


_ @r+ In —2r+ Int eT p (= + Dn = 2(r + Dye + & *) | \ 





VT VT 


When (y; , 8;) and (—y2, —&2) are interchanged, (5.11) is & . 

When we return to the formulation of the sequential procedure of the ob- 
served process, X(t) = Y(t) + wt referred to the lines z = ¢, + dt and z = 
C2 + dot, we use the above expressions with y; = ¢, y¥2 = @, 6: = d; — wu, and 
5, = d, — yw. Again if the procedure is symmetric (¢; = —c, = c,d; = —d, = d) 
the formulas are simplified. 







Corouiary 5.3. If y, = —y2 = c,b, = d — pand& = —d — wy (ec + dT 2 0), 


* 


ef = =~ DP (-1)"(28 + 1) 
— pf amd 
: f Blot vet esd “le (= — w)T — (28+ 1) ‘) 
\ VT 
gelato, (= (d — w)T — (28 + ue) 
VT 


: Cc dT +cec-— +t) = yer ~2(e/T) (dT +e)e (e+) 
-,{,0(7+*54 2 ( 1)"(28 + le 


[R (ee - ves - - wn - p (Sve + lje + (d — p)T | 
VT 


=_ VT (2 + - a) )*e ~—2(e/T) (dT+e)a(e+1) 
= TP,(T) ere ave, abt > (- 


[2 Ie — pet ernee:s 
‘ VT VT 


_ (28 + MeL — aE » (Ot l)e + (d — yw) | 


VT 


If u is replaced by —p, (5.12) is & . 
The second and third parts of (5.12) were used in computing. As long as 
dT +c > 0, the convergence is very rapid. When dT + c = 0, we have 


VT 


sf = ev) & (-1'@0 +) 


(ea) CF 
= TP\(T) + = vf ; uv T) > (- 


i neta (ie gt) et g(t — ul | 
VT VT VT VT ' 


The first series converges as 


“: vT VT | 
¥(-1'@04 )| ee - 


(5.14 
9.14) . 7 (-1)" —2(28+ 1)(e+ uT) | 
vT (28 + 1)*c? — (uT + c)*’ 







the second series converges as 


X(-1) | - (sez rt) + \oee — af 
_)* 4028 + We(uT + ¢)T_ 
2X (-1) [(28 + 1)%c? — (wT + c)**" 


In case c + dT = 0, we can manipulate the series (at least formally) to ob- 
tain 


(5.15) 


* c ee pore eeleel Titel hel (o/T) +n) 
(5.16) & = d—u fs »/2n e Th + etna dv. 


However, this formula is of doubtful usefulness. 


6. Some remarks. The methods used in this paper can be extended to treat 
more general procedures. For example, instead of an upper or lower boundary 
consisting of one line, we could consider the boundary consisting of two lines. 
Let each boundary consist of z = c¢, + dt (i = 1, 2) forO St < Tandz = 
c; + djtfor T < t < T*. The process X(t) [or Y(t)] att = T andt = T* has 
a bivariate normal distribution. Conditional on X(T) and X(T*), the process 
X(t) can be treated in the interval 0 s t < T and T s t < T™ as in Section 
4.3. Then the result can be integrated relative to the bivariate normal distribu- 
tion of X(7') and X(7*). Use of these boundaries would come closer to the 
optimum procedure. 

The procedure that Armitage [2] suggested could also be studied by this 
method. To test » = 0 against two-sided alternatives a procedure is to reject the 
hypothesis if X(t) touches zr = ¢, + dt or z = @ + dt for0 s t Ss T*, where 
¢, > 0, ce < 0 and to accept the hypothesis if X(t) touches z = cr + dit or 
z= c: + dt for T < t S T* where d; > 0, d: < 0 and these last two lines 
intersect at t = 7. (The graph of the boundaries may look roughly like a re- 
versed 5 i Again we can consider the problem conditional on X(T), X(T) 
and then integrate the result. 

For the procedures considered in this paper we can show the following: 

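THEOREM 6.1. If $\gamma_1 + \delta_1 T > \gamma_2 + \delta_2 T$ and $\delta_1 = d - \mu$, $\delta_2 = -d - \mu$,

\[ \frac{\partial P_1(T)}{\partial \mu} = \gamma_1 P_1(T) + \delta_1 \mathcal{E}_1^*. \tag{6.1} \]

PROOF. This is verified by expressing the various functions in terms of Mill's
Ratios and using the fact that $R'(x) = xR(x) - 1$.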
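The derivative identity for Mill's Ratio used in this proof follows from $R(x) = (1 - \Phi(x))/\varphi(x)$ and $\varphi'(x) = -x\varphi(x)$, and it can be confirmed numerically. The following is a small sketch using a central finite difference; the function names and the step size are illustrative choices, not part of the paper.

```python
import math


def mills_ratio(x):
    """R(x) = (1 - Phi(x)) / phi(x), computed via the complementary error function."""
    return math.sqrt(math.pi / 2.0) * math.exp(0.5 * x * x) * math.erfc(x / math.sqrt(2.0))


def mills_derivative_numeric(x, h=1e-5):
    """Central-difference approximation to R'(x)."""
    return (mills_ratio(x + h) - mills_ratio(x - h)) / (2.0 * h)


def mills_derivative_identity(x):
    """Right-hand side of the identity R'(x) = x R(x) - 1."""
    return x * mills_ratio(x) - 1.0


if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.5, 2.0, 4.0):
        print(x, mills_derivative_numeric(x), mills_derivative_identity(x))
```

At $x = 0$ the identity gives $R'(0) = -1$, which matches the slope of the finite-difference estimate.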


Acknowledgments. The author wishes to thank Bernard Epstein and John 
Tukey for helpful suggestions. 


REFERENCES

[1] T. W. ANDERSON, "The integral of a symmetric unimodal function over a symmetric
convex set and some probability inequalities," Proc. Amer. Math. Soc., Vol. 6
(1955), pp. 170-176.

[2] P. ARMITAGE, "Restricted sequential procedures," Biometrika, Vol. 44 (1957), pp. 9-26.

[3] T. G. DONNELLY, "A family of truncated sequential tests," doctoral dissertation,
University of North Carolina, 1957.

[4] J. L. DOOB, "Heuristic approach to the Kolmogorov-Smirnov theorems," Ann. Math.
Stat., Vol. 20 (1949), pp. 393-403.

[5] EDITORIAL, "The normal probability function: tables of certain area-ordinate ratios
and of their reciprocals," Biometrika, Vol. 42 (1955), pp. 217-222.

[6] DAVID C. HALEY, "Estimation of the dosage mortality relationship when the dose is
subject to error," Technical Report No. 15, August 29, 1952, Stanford University.

[7] WASSILY HOEFFDING, "Lower bounds for the expected sample size and the average
risk of a sequential procedure," submitted to Ann. Math. Stat.

[8] J. KIEFER AND LIONEL WEISS, "Some properties of generalized sequential probability
ratio tests," Ann. Math. Stat., Vol. 28 (1957), pp. 57-75.

[9] K. PEARSON, Tables for Statisticians and Biometricians, Part I, Cambridge University
Press, Cambridge, 1924.





SEQUENTIAL TOLERANCE REGIONS¹
By Sam C. SAUNDERS 


Boeing Scientific Research Laboratories 


Summary. Consider a measurable space with a linear ordering on the space 
and the family of all probability measures which assign measure zero to each 
equivalence class induced by the ordering. For such a space and family of prob- 
ability distributions sequential tolerance regions are defined. The procedure as- 
signs for each finite sample a Borel set with boundaries determined by the order 
observations. The sampling terminates when the region remains unchanged for a 
certain number of observations. The coverage of the region thus sequentially 
determined is distribution free with respect to that family of distributions. Some 
relationships are derived between the distribution of the coverage and the 
generating function of the random sample size, which permit the determination 
of one in terms of the other. This paper includes as a special case the previous
results of Jiřina on the distribution of coverage for his sequential procedure.
Also, formulae are obtained for the expected sample sizes of the Jiřina procedure
which were previously unknown. The results of Wilks for fixed sample
tolerance limits are obtained as a limiting case and comparisons are made with 
sequential procedures in terms of coverage and expected sample size. For ex- 
ample it is shown that for one-sided tolerance limits no sequential procedure is as 
good as Wilks fixed sample procedure in the sense that if the expected sample 
sizes are the same the coverage of the Wilks procedure is stochastically greater 
than the coverage of the sequential procedure. 


A discussion of past results. Let $X$ be a random variable (r.v.) on an induced
probability space $(\mathcal{X}, \mathcal{A}, P)$ and let $V = (X_1, X_2, \cdots, X_n)$ be a vector of $n$
independent replications of $X$; denote the induced probability space on which
$V$ is a r.v. by $(\mathcal{X}^*, \mathcal{A}^*, P^*)$. If $D$ is a function mapping $\mathcal{X}^*$ into $\mathcal{A}$, then the r.v.
$D(V)$ has been called a distribution-free tolerance region whenever the distribution
of the random coverage $Q$, defined $Q = P[D(V)]$, does not depend upon
the measure $P$, under the condition that $P$ belongs to some class of probability
measures.

Such tolerance regions as outlined above were first introduced by S. S. Wilks
[1] in the following special case: If $\mathcal{X}$ is the real line, $\mathcal{A}$ the Borel subsets of $\mathcal{X}$
and $L(V)$, $U(V)$ are two statistics from $\mathcal{X}^*$ into $\mathcal{X}$ such that $U(V) \ge L(V)$
almost surely (a.s.), then $U(V)$ and $L(V)$ are, respectively, upper and lower
$\beta$-tolerance limits of probability level $\alpha$ for $\alpha, \beta \in (0, 1)$, if the coverage, letting
$D(V)$ be the open random interval $(L(V), U(V))$, $Q = P[D(V)]$ is such that
$P^*(Q > \beta) = 1 - \alpha$.


Received April 24, 1958; revised April 10, 1959. 
¹ Sections of this paper are related to a thesis written at the University of Washington.
That research was sponsored in part by the Office of Naval Research. 


Under the condition that $P$ assigns measure zero to all one point sets, Wilks
has shown that if $U(V) = X_{(n-s+1)}$, $L(V) = X_{(r)}$ for $s$, $r$ positive integers such
that $n + 1 > r + s > 1$, where $X_{(j)}$ denotes the $j$th order statistic of $V$ (ordering
from the bottom) for $j = 1, \cdots, n$, then $U(V)$ and $L(V)$ are defined a.s.
and

\[ P^*[Q \le \beta] = I_\beta(n + 1 - r - s,\ r + s), \]

where

\[ I_t(m + 1, n + 1) = \frac{(m + n + 1)!}{m!\, n!} \int_0^t z^m (1 - z)^n\, dz \qquad \text{for } t \in (0, 1) \]

denotes the incomplete beta function, which is independent of $P$.
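Wilks' result can be turned into a small calculation of the confidence attached to a fixed-sample tolerance interval. The sketch below assumes integer arguments, for which the incomplete beta function equals a binomial tail sum, $I_t(a, b) = \sum_{j=a}^{a+b-1} \binom{a+b-1}{j} t^j (1-t)^{a+b-1-j}$; the function names are hypothetical.

```python
from math import comb


def reg_inc_beta_int(t, a, b):
    """I_t(a, b) for positive integers a, b, via the binomial tail identity."""
    n = a + b - 1
    return sum(comb(n, j) * t ** j * (1.0 - t) ** (n - j) for j in range(a, n + 1))


def wilks_confidence(n, r, s, beta):
    """P*(Q > beta) = 1 - I_beta(n + 1 - r - s, r + s) for a sample of size n."""
    return 1.0 - reg_inc_beta_int(beta, n + 1 - r - s, r + s)


def smallest_sample_size(r, s, beta, level):
    """Smallest n with P*(Q > beta) >= level (simple linear search)."""
    n = r + s
    while wilks_confidence(n, r, s, beta) < level:
        n += 1
    return n


if __name__ == "__main__":
    # Interval between the smallest and largest observations (r = s = 1).
    print(smallest_sample_size(1, 1, 0.95, 0.95))
```

For $r = s = 1$ the confidence reduces to $1 - n\beta^{n-1} + (n-1)\beta^n$, which gives a direct check on the binomial-sum implementation.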

The work on tolerance limits or regions has been generalized to a great extent
by A. Wald, H. Scheffé, J. W. Tukey, R. Wormleighton, D. A. S. Fraser,
Irwin Guttman, and J. H. B. Kemperman ([2], [3], [4], [5], [6], [7], [8], [9], [10]).
These investigations dealt with the construction of tolerance regions for multivariate
r.v.'s defined on spaces with several distinct generalized orderings;
however, all results were for a fixed sample size.

The first attempt at introducing sequential distribution-free tolerance limits
was made by M. Jiřina [11], who proposed the following procedure for finding
tolerance limits $L$ and $U$ under the same conditions which Wilks used. Write
$X_{j,n}$ for the $j$th order statistic of the first $n$ observations. Let
$r$, $s$, $k$ be positive integers. During the first stage take $r + s$ observations and
set $L^{(1)} = X_{r,r+s}$, $U^{(1)} = X_{r+1,r+s}$. During the $j$th stage, $j = 2, 3, \cdots$, continue
sampling as long as

\[ L^{(j-1)} \le X_{t+i} \le U^{(j-1)} \tag{*} \]

and $i < k$, where $t$ is the number of observations drawn during the preceding
$j - 1$ stages. If (*) holds for $i = k$, terminate the procedure and set $L = L^{(j-1)}$,
$U = U^{(j-1)}$. If $X_{t+i} < L^{(j-1)}$ or $X_{t+i} > U^{(j-1)}$ and $i \le k$, set $L^{(j)} = X_{r,t+i}$ and
$U^{(j)} = X_{t+i-s+1,t+i}$ and continue sampling for the $(j + 1)$th stage. He has shown
that this procedure terminates a.s. and that if $Q = P[(L, U)]$ then

\[ P_{L,U}(Q > \beta) = (1 - \beta)^{r+s} \exp\Big\{ (r + s) \sum_{i=1}^{k} \beta^i / i \Big\}, \]

where $P_{L,U}$ is the probability measure in the probability space on which the
r.v.'s $L$ and $U$ are defined.
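The coverage probability of the sequential procedure is simple to tabulate. The sketch below assumes that the exponent in the formula is the partial sum $(r+s)\sum_{i=1}^{k}\beta^i/i$ (the summation sign is not legible in the scanned source, so this reading is an assumption); with that reading the expression tends to 1 as $k \to \infty$, since $\sum_{i \ge 1}\beta^i/i = -\log(1-\beta)$. The function names are hypothetical.

```python
import math


def jirina_confidence(beta, r, s, k):
    """(1 - beta)^(r+s) * exp((r+s) * sum_{i=1}^k beta^i / i), assumed form."""
    partial = sum(beta ** i / i for i in range(1, k + 1))
    return (1.0 - beta) ** (r + s) * math.exp((r + s) * partial)


def smallest_k(beta, r, s, level, k_max=100000):
    """Smallest run length k for which the confidence reaches `level`."""
    for k in range(1, k_max + 1):
        if jirina_confidence(beta, r, s, k) >= level:
            return k
    raise ValueError("level not reached within k_max")


if __name__ == "__main__":
    for k in (1, 10, 50, 100):
        print(k, jirina_confidence(0.9, 1, 1, k))
    print(smallest_k(0.9, 1, 1, 0.95))
```

The confidence increases with $k$, so the required run length for a given $\beta$ and level can be found by a direct search as in `smallest_k`.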


1. Introduction. Suppose we have a sequence of independent observations of
a r.v. such that any two of these observations may be compared and the worst
(in some sense) determined without knowing the magnitude of either, e.g., they
may be placed on a balance and the lighter one discovered without determining
their weights. We continue to rank these observations until a particular
predetermined arrangement of the ordered observations has occurred, e.g., the
Jiřina case where the worst and the best have been unchanged for a number of
rankings. The total number of observations ranked when such an event occurs
is random. The proportion of the population which will be caught in the region
so determined is random. We ask ourselves, apart from the mathematical difficulties
one may encounter, what are the distributions of these r.v.'s and how
may we optimize within such a class of sequential procedures?

To this end we merely formalize the relevant aspects of the ordered observa- 
tions of a real r.v. with continuous distribution and the related concept of a 
distribution-free tolerance region and its coverage. 


2. The basic sample space. Let P be a probability measure on a measurable 
space (X, %). Following the usual notational convention we shall let X be the 
r.v. on &, i.e., it is the identity function on %, and we shall use it to describe 
events as follows: If x is a proposition involving relations and/or functions for 
which A = {xz ¢ ¥:x(x)} ¢ & then we shall denote A by [x(X)], loosely speaking 
we say [x(X)] is the set of points in ¥ such that the relation z is true. 

Let $\theta$ be a balance on $\mathcal{X}$. By a balance we mean a triplet of binary relations on
$\mathcal{X}$, say $\theta = (<, \sim, >)$, where $\sim$ is an equivalence relation associated with
the irreflexive relations $<$, $>$ such that for each $x, y \in \mathcal{X}$ exactly one of $x < y$
or $x > y$ or $x \sim y$ must hold. The relations in the balance $\theta = (<, \sim, >)$
induce partial orderings on the set of subsets of $\mathcal{X}$ and we write, e.g., for
$S, T \subset \mathcal{X}$, that $S < T$ iff (read if and only if) $s \in S$, $t \in T$ imply $s < t$.

We say a set $Z$ is dense in $\mathcal{X}$ iff $x, y \in \mathcal{X}$ and $x < y$ imply $x < z < y$ for some
$z \in Z$.
(A) We assume there exists a countable set $Z$ which is dense in $\mathcal{X}$ with respect
to $\theta$.

It follows that if $S < T$ and $S \cup T = \mathcal{X}$, then, whether or not $S$ or $T$ is
empty,

$$S = \sup_{z \in ZS}\,[X < z] \quad\text{or}\quad S = \inf_{z \in ZT}\,[X < z], \tag{2.1}$$

and we have also, if $S' = S - \sup_{z \in ZS}\,[X < z]$ is not empty, then

$$S' = [X \sim y] \quad\text{for some } y \in T. \tag{2.2}$$


Now in order to assure the measurability of the sets under discussion let us
define the class $\mathcal{S}$ of sets as follows:

$$\mathcal{S} = \{S \subset \mathcal{X} : S < \mathcal{X} - S\}.$$

(B) We assume the minimal $\sigma$-algebra of $\mathcal{S}$ is $\mathcal{A}$.

As a point of comparison our assumptions (A) and (B) imply the assumptions 
(i) and (ii) of Kemperman [10] in his paper on generalized tolerance regions. 

Assumptions (A) and (B) have been made stronger than Kemperman’s so as 
to avoid such measurability considerations as arose in his paper. This is done 
by utilizing the concept of a Lusin space which originated with Blackwell [12]. 
The definition is as follows: a pair $(\Omega, \mathcal{B})$ is a Lusin space iff (a) $\mathcal{B}$ is separable,
i.e., there is a sequence $\{B_n\}$ of elements of $\mathcal{B}$ such that $\mathcal{B}$ is the minimal
$\sigma$-algebra of $\{B_n\}$, and (b) the range of every real-valued $\mathcal{B}$-measurable function
on $\Omega$ is an analytic set, i.e., a set which is the continuous image of the set of
irrational numbers.

SEQUENTIAL TOLERANCE REGIONS 201

We have 

THEOREM 2.1: Under assumptions (A) and (B), $(\mathcal{X}, \mathcal{A})$ is a Lusin space and
the atoms of $\mathcal{A}$ are the sets of equivalence classes induced by $\theta$ on $\mathcal{X}$.

PROOF: Let $\mathcal{X}^*$ denote the set of equivalence classes induced on $\mathcal{X}$ by
$\theta = (<, \sim, >)$ and set $\{Z_n\}_{n=1}^{\infty} = \{[X < z] : z \in Z\}$; hence from (B) it follows by definition that
$\mathcal{A}$ is separable. That the atoms of $\mathcal{A}$ are the points of $\mathcal{X}^*$ follows from the
definition of the atoms of a separable $\sigma$-algebra. To complete the proof we remark
that in the natural topology $\mathcal{T}$, with typical element $[y < X < z]$, we have $(\mathcal{X}^*, \mathcal{T})$
metrizable. This follows from the Urysohn Metrization Theorem (see, e.g., p.
125, Kelley, General Topology). Now a metric space is analytic if it is the
continuous image of the set of irrational numbers (see Blackwell [12]). We define a
function $g$ on $\mathcal{X}$ as follows:

$$g(x) = \sum_{n=1}^{\infty} c_n(x)/3^{n}$$

where $c_n$ is the characteristic function of $Z_n$. Now $g$ is 1-1 in the sense that
$g(x) = g(y)$ implies $x \sim y$, and $g$ is order reversing in the sense that $g(x) < g(y)$
implies $x > y$. Now $g$ is clearly continuous. Express $r \in (0, 1)$ uniquely in its
dyadic expansion $r = \sum_{n=1}^{\infty} a_n/2^{n}$. Now set $h(r) = \sum_{n=1}^{\infty} a_n/3^{n}$. Then $h$ maps
$(0, 1)$ onto $g[\mathcal{X}]$ in a continuous manner, and so $\mathcal{X}^*$ is the continuous image of
the function $g^{-1}h$. But since the open unit interval is the continuous image of
the irrational numbers the result is proved.

Now we remark that the atoms of a Lusin space need not be points and they 
are not in this case. We remark further that a Lusin space ensures a regularity 
which along with other advantages permits the identification of Borel and 
Baire functions and ensures the existence of conditional expectations. 

The function defined on $\mathcal{X}$ by $F(x) = P[X < x]$ is the distribution of $X$ and
$F$ maps $\mathcal{X}$ into the unit interval.

THEOREM 2.2: Now $U = F(X)$ is a r.v. with the uniform distribution on $(0, 1)$
iff $P[X \sim x] = 0$ for each $x \in \mathcal{X}$.

PROOF: If for some $x \in \mathcal{X}$ we have $P[X \sim x] > 0$ then $F$ is not onto $(0, 1)$
and $U$ cannot be uniform. Let $u \in (0, 1)$ and set

$$S = [F(X) \le u], \qquad T = [F(X) > u].$$

Then making use of the properties (2.1) and (2.2) the proof follows.

A balance $\theta$ was said by Kemperman to be continuous with respect to the
measure $P$ iff $P[X \sim x] = 0$ for each $x \in \mathcal{X}$.

Let $\mathcal{P}$ be the class of probability measures on $\mathcal{A}$ which are continuous with
respect to the balance $\theta$ and hereafter let $P$ denote generically an element of $\mathcal{P}$.

We have a r.v. $X$ defined on the probability space $(\mathcal{X}, \mathcal{A}, P)$ where $P \in \mathcal{P}$.







Let $W$ denote the set of positive integers. We write $X_W = (X_1, \cdots, X_n, \cdots)$
for the r.v. on the probability space $(\mathcal{X}_W, \mathcal{A}_W, P_W)$ where $\mathcal{X}_W$ is the countable
cartesian product of $\mathcal{X}$ with itself, $\mathcal{A}_W$ is the $\sigma$-field of subsets of $\mathcal{X}_W$ generated
by all measurable cylinders in $\mathcal{X}_W$ and $P_W$ is the product measure on $\mathcal{A}_W$
generated by $P$. From Blackwell's paper [12] we have the following:

THEOREM 2.3: $(\mathcal{X}_W, \mathcal{A}_W, P_W)$ as defined above is a Lusin space.

For $x_W \in \mathcal{X}_W$ we label $x_{j,n}$ as the $j$th ordered element determined by $\theta$ from
$x_W^n = (x_1, \cdots, x_n)$, where $x_{j,n} < x_{j+1,n}$ for $j = 1, \cdots, n - 1$. Thus a balance
allows the determination of the r.v. $X_{j,n}$ which is the $j$th order observation of $n$
with respect to $\theta$ from the random vector $X_W^n$.

Now extending our descriptive notation to elements of $\mathcal{A}_W$, for a given $n \in W$
we set $K_n^{(1)} = [X_1 < X_2 < \cdots < X_n]$ and then let

$$K_n^{(i)} = [X_{i_1} < X_{i_2} < \cdots < X_{i_n}]$$

for each of the $n!$ permutations $(i) = (i_1, \cdots, i_n)$ of $(1, 2, \cdots, n)$.

We will say that $B_n \in \mathcal{A}_W$ is a simplicial set over $\{1, \cdots, n\}$ whenever there
exists a set $\nu$ which is a subset of $\{1, 2, \cdots, n!\}$ such that $B_n = \bigcup_{j \in \nu} K_n^{(j)}$ a.s.
Now any simplicial set is a cylinder set and except for a set of probability zero
is the union of simplexes and as such is a set which may be defined by
arrangements of the ordered observations. If we let $c(\nu)$ denote the cardinality of the
set $\nu$ as defined above, then from the independence of $X_1, \cdots, X_n$ we have
$P_W(B_n) = c(\nu)/n!$.
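
As a small illustration of $P_W(B_n) = c(\nu)/n!$, the following sketch counts the orderings that make up a simplicial set. It is only an assumption-laden example, not part of the original paper: the function names and the sample event ("the last observation is neither the smallest nor the largest") are hypothetical, and an event is represented by a predicate on the tuple of ranks of $X_1, \cdots, X_n$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def simplicial_probability(n, event):
    """Return c(nu)/n! for an event defined through the ordering of n observations.

    `event` receives a tuple `ranks`, where ranks[i] is the rank (1 = smallest)
    of observation i+1, and returns True when that ordering belongs to the set.
    """
    count = sum(1 for ranks in permutations(range(1, n + 1)) if event(ranks))
    return Fraction(count, factorial(n))


def last_is_interior(ranks):
    """Hypothetical example event: X_n is neither the smallest nor the largest."""
    return 1 < ranks[-1] < len(ranks)


if __name__ == "__main__":
    print(simplicial_probability(4, last_is_interior))
```

Because every ordering is equally likely under a continuous balance, the probability depends only on how many of the $n!$ orderings are included, which is what the count implements.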

An event $A \in \mathcal{A}_W$ is said to be of structure (d) on $T = \{t_1, \cdots, t_n\} \subset W$ iff
there exists a measurable relation $\delta$ symmetric in its arguments and defined on
the unit cube such that for any $P \in \mathcal{P}$, $A = [\delta(F(X_{t_1}), \cdots, F(X_{t_n}))]$ a.s. This
nomenclature is adopted from Birnbaum and Rubin [13] because of the obvious
similarity.

We have 

THEOREM 2.4: If $B_n$ is some simplicial event and $A_n$ is an event of structure
(d) on $\{1, \cdots, n\}$, then the two events are independent.

PROOF: It is sufficient to assume that for some $\nu$ we have a.s.

$$B_n = \bigcup_{j \in \nu} K_n^{(j)}.$$

Now by the disjointedness of the $K_n^{(j)}$ and the nature of $A_n$,

$$P_W(A_n B_n) = \sum_{j \in \nu} P_W\big(K_n^{(j)} \cap [\delta(F(X_1), \cdots, F(X_n))]\big)$$

$$= \frac{1}{n!} \sum_{j \in \nu} P_W[\delta(F(X_{1,n}), \cdots, F(X_{n,n}))]$$

$$= \frac{c(\nu)}{n!}\, P_W[\delta(F(X_1), \cdots, F(X_n))] = P_W(B_n)\, P_W(A_n),$$

and hence we have independence.
THEOREM 2.5: If $B$ is a simplicial event on $\{1, \cdots, n\}$, $B^*$ is a simplicial event
on $\{n + 1, \cdots, m\}$ and $C = AA^*$ where $A$ is of structure (d) on $\{1, \cdots, n\}$
and $A^*$ is of structure (d) on $\{n + 1, \cdots, m\}$, then the events $B$, $B^*$ and $C$ are
independent.

PROOF: This follows immediately from the preceding and from independence
of the components of $X_W$.


3. Sequential sampling plans. Let $S = (S_1, \cdots, S_n, \cdots)$ be a sequence
of disjunct measurable cylinders in $\mathcal{X}_W$ such that each $S_n$ is a simplicial set
over $\{1, \cdots, n\}$. We call such a sequence a sequential sampling plan and the
events $S_n$ stopping sets. The stopping rule for our sequential sampling plan is:
stop sampling after the $n$th observation iff $x_W \in S_n$. Because $S_n$ is a cylinder set
over $\{1, \cdots, n\}$ it is always known after $n$ observations whether or not the
event $S_n$ has occurred. We wish to choose $S$ so that for each $P \in \mathcal{P}$ we have
$\sum_n P_W(S_n) = 1$, i.e., sampling terminates a.s. We define the r.v. $N$ on $\mathcal{X}_W$ into
$W$ by $N(x_W) = n$ iff $x_W \in S_n$. $N$ will be called the random sample size and its
distribution will depend upon our choice of $S$.

We now exhibit an obvious lemma for later reference which concerns the 
construction of a sequential sampling plan from a sequence of simplicial sets. 

LEMMA 3.1: If $B = (B_1, \cdots, B_n, \cdots)$ is a sequence of events and $B_n$ is a
simplicial set over $\{1, \cdots, n\}$ then $S_n = B_n \cap \bigcap_{j=1}^{n-1} \bar{B}_j$ for $n \in W$ defines a
sequence $S$ of disjunct simplicial sets, and if we write

$$p_n = P_W(S_n), \qquad q_n = P_W\Big(\bigcap_{j=1}^{n} \bar{B}_j\Big) \quad\text{for } n \in W$$

it follows that

(i) $p_n = q_{n-1} - q_n$ for $n \in W$ where $q_0 = 1$,

(ii) $\sum_n p_n = 1$ iff $\lim_n q_n = 0$.

Let $D$ be a function which maps $W \times \mathcal{X}_W$ into $\mathcal{A}$ which for fixed $n \in W$ is a
function of $x_W^n$ only. Let us write $D(x_W^n)$. The coverage of $D$ is defined by

$$Q(x_W^n) = P(D(x_W^n)).$$

Hence for a given sequential sampling plan $S$ which determines a random
sample size $N$ we have a sequential tolerance region $D(X_W^N)$ as a set valued r.v.
on $(\mathcal{X}_W, \mathcal{A}_W, P_W)$ taking values in $\mathcal{A}$ and the random coverage $Q(X_W^N)$ is a r.v.
on the same probability space but assuming values in the unit interval.

We now begin a construction of $S$ and $D$ in terms of simplicial sets. Let
$b = (b_1, \cdots, b_n, \cdots)$ be a non-decreasing sequence of positive integers such
that

1° $n \le b_n$ for every $n \in W$ and if $b_n = n$ for some $n \in W$ then $b_{n+j} = n + j$
for all $j \in W$,

2° $\lim_n b_n/n = 1$.

We call $b_n$ a stopping number and $b$ is the sequence of sample sizes at which
inspection takes place to ascertain if a stopping event has occurred.

For $x_W \in \mathcal{X}_W$ we define a subset $A_j(x_W^n) = [x_{j-1,n} < X < x_{j,n}]$ for $j = 1,
\cdots, n + 1$ with the obvious definition of $A_1$, $A_{n+1}$. Due to the continuity of
$\theta$ with respect to $P \in \mathcal{P}$ it follows for each $n \in W$ that $A_j(X_W^n)$ is a statistically
equivalent block. This nomenclature follows from:

LEMMA 3.2: For fixed $n \in W$ write $U_j = F(X_{j,n})$ for $j = 1, \cdots, n$. Then
the random coverages for each $j = 1, \cdots, n + 1$ defined by

$$C_j(X_W^n) = P[A_j(X_W^n)] = U_j - U_{j-1}$$

(where we set $U_0 = 0$, $U_{n+1} = 1$) have the following properties:

(i) $\sum_{j=1}^{n+1} C_j(X_W^n) = 1$, $0 \le C_j(X_W^n) \le 1$ a.s. for $j = 1, \cdots, n + 1$,

(ii) the distribution of the $C_j(X_W^n)$'s is completely symmetrical,

(iii) $Q(X_W^n) = \sum_{j=1}^{k} C_{i_j}(X_W^n)$, where $(C_{i_1}, \cdots, C_{i_{n+1}})$ is any arrangement of
$(C_1, \cdots, C_{n+1})$, has the distribution $P_W[Q(X_W^n) \le q] = I_q(k, n - k + 1)$ for
$0 < q < 1$.

PROOF: From Theorem 2.2 we know that the $U_j$'s constitute a set of ordered
observations from the r.v. with uniform distribution on the open unit interval
and the properties (i), (ii) and (iii) are known consequences of this fact (see
Tukey [4]).

Let $\lambda = (\lambda_1, \cdots, \lambda_n, \cdots)$ be a sequence of subsets of $W$ such that there
exists an $\eta \in W$ and $\lambda_n$ is empty if $n < \eta$ and $\lambda_n$ is a non-empty subset of $\{1, \cdots,
n + 1\}$ if $n \ge \eta$, and further, such that $\bar{\lambda}_n = \{1, \cdots, n + 1\} - \lambda_n$ has the
property that for all $n \ge \eta$, $c(\bar{\lambda}_n) = \eta$. We define $D$ in terms of $\lambda$ by

$$D(x_W^n) = \bigcup_{j \in \lambda_n} A_j(x_W^n) \quad\text{for } n \ge \eta,$$

and if additionally we have for each $n \in W$, $D(x_W^n) \subset D(x_W^{n+1})$, then $\lambda$ will be
called a selection sequence with deletion number $\eta$. Now $\lambda_n$ tells us what union of
statistically equivalent blocks forms our tolerance region after $n$ observations
and $\eta$ is the number of statistically equivalent blocks which are deleted, this
number remaining constant. The monotone restriction on $D$ requires that the
tolerance region not decrease with increasing sample size.
It is obvious that for each $n \in W$, $x_W \in \mathcal{X}_W$,

$$Q(x_W^n) = P(D(x_W^n)) = \sum_{j \in \lambda_n} P(A_j(x_W^n)) = \sum_{j \in \lambda_n} C_j(x_W^n)$$

and the r.v.'s so defined have the properties specified in Lemma 3.2.

We now describe a simplicial event in terms of the tolerance region and a type
of stability, that is, a simplicial event has occurred when after $m$ observations
the tolerance region is the same as it was after $n$ observations ($m \ge n$). More
formally,

THEOREM 3.3: Let a selection sequence $\lambda$ with deletion number $\eta$ be given and
for fixed integers $n$, $m$ such that $\eta \le n \le m$ set $B_{m,n} = [D(X_W^n) = D(X_W^m)]$.
Then $B_{m,n}$ is a simplicial set over $\{1, \cdots, m\}$ and

$$P_W(B_{m,n}) = \binom{n}{\eta} \Big/ \binom{m}{\eta}.$$

PROOF: Now $D(x_W^n) = D(x_W^m)$ iff $x_j \in D(x_W^n)$ for each $j = n + 1, \cdots, m$.
From the independence of the $X_j$'s we have

$$P_W\Big\{\bigcap_{i=n+1}^{m} \Big[X_i \in \bigcup_{j \in \lambda_n} A_j(x_W^n)\Big] \,\Big|\, X_W^n = x_W^n\Big\} = \Big(\sum_{j \in \lambda_n} C_j(x_W^n)\Big)^{m-n}.$$

By the properties of conditional expectation for Lusin spaces we have from
Lemma 3.2, making the substitution $z = \sum_{j \in \lambda_n} C_j(x_W^n)$,

$$P_W(B_{m,n}) = \int_0^1 z^{m-n}\, dI_z(n - \eta + 1, \eta)$$

and integration yields the result. That $B_{m,n}$ is simplicial is obvious.
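
The integration step can be checked exactly. The sketch below is an illustration only (the function names are hypothetical): it compares the binomial-ratio formula of Theorem 3.3 with the $(m-n)$th moment of a Beta$(n - \eta + 1, \eta)$ variable, using the standard product formula $E[Z^r] = \prod_{i=0}^{r-1} (a+i)/(a+b+i)$ for $Z \sim$ Beta$(a, b)$.

```python
from fractions import Fraction
from math import comb


def prob_region_unchanged(n, m, eta):
    """Binomial-ratio form of Theorem 3.3: C(n, eta) / C(m, eta)."""
    return Fraction(comb(n, eta), comb(m, eta))


def beta_moment(n, m, eta):
    """Exact E[Z^(m-n)] for Z ~ Beta(n - eta + 1, eta).

    Z plays the role of the coverage of the region after n observations;
    the region is unchanged when the next m - n observations all fall in it.
    """
    a, a_plus_b = n - eta + 1, n + 1
    result = Fraction(1)
    for i in range(m - n):
        result *= Fraction(a + i, a_plus_b + i)
    return result


if __name__ == "__main__":
    print(prob_region_unchanged(5, 8, 2), beta_moment(5, 8, 2))
```

Working with `Fraction` avoids rounding, so the two expressions can be compared for equality rather than approximately.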

A couple $(\lambda, b)$ we will call a decision rule and this nomenclature we make
clear momentarily. For a given decision rule we can use the simplicial sets as
defined in Theorem 3.3 to obtain a sequence of such sets by taking $m = b_n$
for each $n \ge \eta$. Since we can without confusion omit the second subscript, let
us do so and write $B_{b_n}$ for the event as described. Let us define $B_{b_n} = \emptyset$ for $n =
1, \cdots, \eta - 1$. Now from the sequence $\{B_{b_n}\}_{n=1}^{\infty}$ we can use Lemma 3.1 to
construct a sequential sampling plan $\{S_{b_n}\}$.

We have 

THEOREM 3.4: If from a given decision rule $(\lambda, b)$ the sequential sampling plan
$\{S_{b_n}\}$ is constructed in the above fashion then $P_W(\sup_n S_{b_n}) = 1$.

PROOF: From (ii) of Lemma 3.1 it is sufficient to show that $q_n \to 0$ since
$S_{b_n}$ is simplicial. Now

$$q_n = P_W\Big(\bigcap_{j=1}^{n} \bar{B}_{b_j}\Big) \le P_W(\bar{B}_{b_n}) = 1 - P_W(B_{b_n}).$$

Since we know

$$\lim_{n \to \infty} \frac{n!}{(n-h)!\, n^{h}} = 1 \quad\text{for any } h \in W,$$

it follows from Theorem 3.3 and the definition of stopping numbers that

$$\lim_n P_W(B_{b_n}) = \lim_n \frac{(b_n - \eta)!\; n!}{(n - \eta)!\; b_n!} = 1.$$

We remark that we are assured of stopping sampling at the least $n \in W$ such
that $b_n = n$ if such exists. Further: the proof of this theorem justifies the
introduction of assumption 2° in the definition of $\{b_n\}$.
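We examine more closely the structure of the sampling plan $\{S_{b_n}\}$ in the
following primary:

THEOREM 3.5: For a given decision rule $(\lambda, b)$ let $\sigma$ be the function defined by

$$\sigma_n = \begin{cases} \max\{j \in W : b_j \le n - 1\} & \text{for } n \ge b_1 + 1, \\ 0 & \text{for } n = 1, \cdots, b_1. \end{cases}$$

In words: $\sigma_n$ is the largest subscript of the $\{b_j\}$ for which $b_j < n$ if $n > b_1$ and
otherwise it is zero. Let $L_n$ be the set defined a.s. by $L_n = [X_n \notin D(X_W^{n-1})]$. Then

$$S_{b_n} = \begin{cases} B_{b_n} \cap \bigcap_{j=\eta}^{n-1} \bar{B}_{b_j} = B_{b_n} \cap L_n \cap \bigcap_{j=\eta}^{\sigma_n} \bar{B}_{b_j} & \text{for } n \ge \eta, \\ \emptyset & \text{for } n < \eta. \end{cases}$$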
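
The index $\sigma_n$ is easy to compute directly from its definition. The following sketch is illustrative only; the helper name `sigma` and the representation of the stopping sequence as a Python function `b(j)` are assumptions made for the example.

```python
def sigma(n, b):
    """Largest j >= 1 with b(j) <= n - 1, or 0 when no such j exists.

    `b` maps j to the stopping number b_j; it is assumed non-decreasing
    with b(j) >= j, so the loop below always terminates.
    """
    j = 0
    while b(j + 1) <= n - 1:
        j += 1
    return j


if __name__ == "__main__":
    k = 3
    fixed_increment = lambda j: j + k      # b_j = j + k
    fixed_sample = lambda j: max(j, k)     # b_j = k for j <= k, then b_j = j
    print([sigma(n, fixed_increment) for n in range(1, 9)])
    print([sigma(n, fixed_sample) for n in range(1, 9)])
```

For the fixed increment sequence $b_j = j + k$ this gives $\sigma_n = 0$ for $n \le k + 1$ and $\sigma_n = n - k - 1$ otherwise, the form used later for Procedure 2.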


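PROOF: By construction in Lemma 3.1 we have only to show the equivalence
mentioned for $n \ge \eta$. Let $n$ be fixed and denote the right-hand side by $A$. To
prove that $A = S_{b_n}$ we will show that

$$L_n \subset \bigcap_{j=\sigma_n+1}^{n-1} \bar{B}_{b_j} \quad\text{and that}\quad S_{b_n} \subset A.$$

We have from the definition an equivalent expression

$$B_{b_n} = \bigcap_{i=n+1}^{b_n} [X_i \in D(X_W^n)]$$

but since $D(X_W^n) \subset D(X_W^{n+1})$ a.s. we have that $L_n \subset \bar{B}_{b_j}$ for all $j$ such that
$n > j > \sigma_n$, and hence $L_n \subset \bigcap_{j=\sigma_n+1}^{n-1} \bar{B}_{b_j}$. Now we see that $\bar{L}_n \cap B_{b_n} \subset B_{b_{n-1}}$
by noting that

$$[X_n \in D(X_W^{n-1})] \cap \bigcap_{i=n+1}^{b_n} [X_i \in D(X_W^n)]$$

is contained in $\bigcap_{i=n}^{b_{n-1}} [X_i \in D(X_W^{n-1})]$, which proves the result.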
We now have: 


COROLLARY 3.6: Using the notation introduced previously we have

$$p_n = P_W(S_{b_n}) = q_{\sigma_n} \binom{n-1}{\eta-1} \Big/ \binom{b_n}{\eta} \quad\text{for } n \in W$$

with the understanding that $\binom{n-1}{\eta-1} = 0$ if $n < \eta$.

PROOF: By the preceding theorem we have

$$p_n = P_W\Big(B_{b_n} \cap L_n \cap \bigcap_{j=\eta}^{\sigma_n} \bar{B}_{b_j}\Big).$$

Since $(L_n \cap \bigcap_{j=\eta}^{\sigma_n} \bar{B}_{b_j})$ is a simplicial event on $\{1, \cdots, n\}$ and $B_{b_n}$ can be
expressed as an event having structure (d) on $\{1, \cdots, n\}$, independence follows
by Theorem 2.4. Apply this argument a second time along with Theorem 3.3
and simplify to obtain the result.
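
Together with $q_n = q_{n-1} - p_n$ from Lemma 3.1, the corollary determines the whole distribution of the stopping index recursively. The sketch below is a hedged illustration rather than the author's procedure: it assumes the stopping probability has the form $p_n = q_{\sigma_n}\binom{n-1}{\eta-1}/\binom{b_n}{\eta}$ (the form consistent with Theorem 3.3 and with $P_W(L_n) = \eta/n$), and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def stopping_probabilities(eta, b, n_max):
    """Return lists (p, q) indexed 0..n_max.

    p[n] = probability of stopping at inspection n (sample size b(n)),
    q[n] = probability of not having stopped through inspection n, q[0] = 1.
    Assumes p_n = q_{sigma_n} * C(n-1, eta-1) / C(b_n, eta).
    """
    p = [Fraction(0)] * (n_max + 1)
    q = [Fraction(1)] * (n_max + 1)
    for n in range(1, n_max + 1):
        if n >= eta:
            s = 0                      # sigma_n: largest j with b(j) <= n - 1
            while b(s + 1) <= n - 1:
                s += 1
            p[n] = q[s] * Fraction(comb(n - 1, eta - 1), comb(b(n), eta))
        q[n] = q[n - 1] - p[n]
    return p, q


if __name__ == "__main__":
    p, q = stopping_probabilities(1, lambda n: n + 1, 6)   # eta = 1, b_n = n + 1
    print(p[1:], q[1:])
```

For a fixed sample rule ($b_n = k$ for $n \le k$) the recursion exhausts all probability by $n = k$, while for a fixed increment rule the $q_n$ shrink rapidly without reaching zero.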


4. The distribution and generating function for a decision rule. Let $r =
(\eta, \lambda, b)$ be a decision rule where $\lambda$ is a selection sequence with deletion number $\eta$
and $b$ is a stopping sequence. This redundancy of notation has advantages
as we shall see. Let $R$ be the space of decision rules. Once $r \in R$ is chosen, the
random sample size $N$, the tolerance region $D(X_W^N)$, and the coverage $Q(X_W^N)$
[all are r.v.'s defined on the probability space $(\mathcal{X}_W, \mathcal{A}_W, P_W)$ and all are
functions of $r$, a fact disguised by our notation] are necessarily determined.

We now define three functions on $(0, 1) \times R$: the distribution $G$ of the
coverage, the generating function $M$ of the sample size and a derived function $\Phi$
which is determined from $M$.

$$G(\beta, r) = P_W[Q(X_W^N) \le \beta],$$
$$M(\beta, r) = E(\beta^{N}),$$
$$\Phi(\beta, r) = M_\eta(\beta, r)/(\eta - 1)!$$

where $r = (\eta, \lambda, b)$ and subscripts of $M$ denote derivatives with respect to $\beta$.
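We now exhibit the main result concerning these functions:

THEOREM 4.2: If $r = (\eta, \lambda, b)$ is fixed, then

$$G(\beta, r) = \sum_{n=\eta}^{\infty} p_n I_\beta(b_n + 1 - \eta, \eta),$$

$$M(\beta, r) = \sum_{n=\eta}^{\infty} p_n \beta^{b_n},$$

$$\Phi(t, r) = \eta \sum_{n=\eta}^{\infty} \binom{n-1}{\eta-1} q_{\sigma_n} t^{b_n - \eta}.$$

However, we have these relationships holding:

$$G(\beta, r) = \sum_{j=0}^{\eta-1} \frac{(1 - \beta)^{j}}{j!} M_j(\beta, r) = \int_0^{\beta} (1 - t)^{\eta-1} \Phi(t, r)\, dt,$$

$$M(\beta, r) = \int_0^{\beta} (\beta - t)^{\eta-1} \Phi(t, r)\, dt.$$

PROOF: Let $N$ be the random sample size determined by $r$. Now by definition
$N(x_W) = b_n$ iff $x_W \in S_{b_n}$ so the stated result for $M$ is immediate. Now

$$G(\beta, r) = P_W[Q(X_W^N) \le \beta] = \sum_n P_W\big(S_{b_n} \cap [Q(X_W^{b_n}) \le \beta]\big)$$

and by utilizing theorems 2.4 and 2.5 we have

$$G(\beta, r) = \sum_n p_n P_W[Q(X_W^{b_n}) \le \beta \mid X_W \in S_{b_n}]$$

and by Lemma 3.2 the result for $G$ follows. That $\Phi$ is as claimed follows from
Corollary 3.6 and the definition and the remaining equations follow from
repeated integration by parts.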
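
The integral relation between $G$ and $\Phi$ can be checked numerically in the simplest case of a fixed sample size $k$, where $M(\beta) = \beta^{k}$ and hence $\Phi(t) = \eta\binom{k}{\eta} t^{k-\eta}$. The sketch below is an illustration under that assumption, with hypothetical function names; it evaluates $I_\beta(k + 1 - \eta, \eta)$ through the binomial-tail identity and compares it with a composite Simpson approximation of $\int_0^\beta (1-t)^{\eta-1}\Phi(t)\,dt$.

```python
from math import comb


def coverage_cdf_fixed(beta, k, eta):
    """I_beta(k + 1 - eta, eta) via sum_{j < eta} C(k, j) beta^(k-j) (1-beta)^j."""
    return sum(comb(k, j) * beta ** (k - j) * (1 - beta) ** j for j in range(eta))


def phi_fixed(t, k, eta):
    """Phi(t) = M_eta(t) / (eta - 1)! for M(t) = t^k."""
    return eta * comb(k, eta) * t ** (k - eta)


def coverage_cdf_from_phi(beta, k, eta, steps=2000):
    """Composite Simpson approximation of int_0^beta (1-t)^(eta-1) Phi(t) dt.

    `steps` must be even.
    """
    h = beta / steps
    f = lambda t: (1 - t) ** (eta - 1) * phi_fixed(t, k, eta)
    total = f(0.0) + f(beta)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


if __name__ == "__main__":
    print(coverage_cdf_fixed(0.8, 10, 3), coverage_cdf_from_phi(0.8, 10, 3))
```

The integrand is a polynomial, so Simpson's rule with a modest number of steps already agrees with the closed form to many decimal places.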


5. A derivation of the Wilks and Jiřina tolerance region procedures. We shall
call any family $\chi$ of decision rules a tolerance region procedure whenever one can
make $G(\beta, r)$ for given $\beta \in (0, 1)$ arbitrarily small by proper choice of $r \in \chi$.
This definition includes fixed sample procedures as well as sequential procedures.

We now use our results to obtain the known results.

Procedure 1: (Wilks [1]) Let $\chi = \{(\eta, \lambda, b) \in R : b_n = k$ for all $n \le k$, for
some $k \in W\}$. Now any $b$ has only one stopping number, say $k$, hence this is a
fixed sample procedure where the $k$ observations are drawn and the tolerance
region determined by using $\lambda$.

If $r = (\eta, \lambda, b)$ and $b_n = k$ for all $\eta \le n \le k$, then we have $G(\beta, r) =
I_\beta(k + 1 - \eta, \eta)$, $E(N) = k$. The proof is immediate since by definition we
have $G(\beta, r) = I_\beta(k + 1 - \eta, \eta) \sum_{n=\eta}^{k} p_n$ and

$$M(\beta, r) = \beta^{k}, \qquad \Phi(t, r) = \eta \binom{k}{\eta} t^{k-\eta}.$$

It is clear that this is a procedure since the parameter $k$ can be taken arbitrarily
large for each fixed $\eta$.


Procedure 2: (Jiřina [11]) Let $\chi = \{(\eta, \lambda, b) \in R : b_n = k + n$ for each $n \in W$,
for some $k \in W\}$. This is a fixed increment procedure in which sampling stops
if the tolerance region obtained from the first $n$ observations remains fixed
during the observation of the next $k$.

We shall show that for any $r \in \chi$ for which $r = (\eta, \lambda, b)$ and $b$ has increment
$k \in W$ such that $\eta < k + 1$ then

$$G(\beta, r) = 1 - (1 - \beta)^{\eta} \exp\Big\{\eta \sum_{j=1}^{k} \beta^{j}/j\Big\}$$

and if $\eta \ge 2$ then we have

$$E(N) = \eta(\eta - 1) \int_0^1 (1 - t)^{\eta-2}\, t^{k} \exp\Big\{\eta \sum_{j=1}^{k} t^{j}/j\Big\}\, dt$$

and for $\eta = 1$ we have

$$E(N) = \exp\Big\{\sum_{j=1}^{k} 1/j\Big\}.$$

We now turn to the proof of the above results. In the light of Theorem
4.2 it is sufficient that we determine the derived function $\Phi$ since from it we
can determine both $M$ and $G$.


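Let us fix $r \in \chi$ as described above and omit its mention. By Theorem 4.2 we
have

$$\Phi(t) = \eta \sum_{n=\eta}^{\infty} \binom{n-1}{\eta-1} q_{\sigma_n} t^{n+k-\eta}$$

where by definition

$$\sigma_n = \begin{cases} 0 & \text{for } n \le k + 1, \\ n - k - 1 & \text{for } n \ge k + 2, \end{cases}$$

and $q_n = 1$ for $n = 0, 1, \cdots, \eta - 1$. Therefore, it follows that

$$\Phi(t) = \eta t^{k} \sum_{n=\eta}^{\infty} \binom{n-1}{\eta-1} q_{\sigma_n} t^{n-\eta}.$$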





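To determine $\Phi$ it is sufficient to determine $\psi$ where $\Phi(t) = \eta t^{k} \psi(t)$.
Therefore, using the definition of $\sigma$ given above we have, since $\eta < k + 1$,

$$\psi(t) = \sum_{n=\eta}^{k+1} \binom{n-1}{\eta-1} t^{n-\eta} + \sum_{n=k+2}^{\infty} \binom{n-1}{\eta-1} q_{n-k-1} t^{n-\eta}$$

$$= \sum_{n=\eta}^{k+\eta} \binom{n-1}{\eta-1} t^{n-\eta} + \sum_{n=\eta}^{\infty} \binom{n+k}{\eta-1} q_n t^{n+k+1-\eta}.$$

Let $\psi = f + g$ where $f$ and $g$ are, respectively, the first and second terms in
the expression above. Then, using the prime notation for derivatives,

$$\psi'(t) = \sum_{n=\eta}^{k+\eta} \binom{n-1}{\eta-1} (n - \eta)\, t^{n-\eta-1} + \sum_{n=\eta}^{\infty} \binom{n+k}{\eta-1} (n + k + 1 - \eta)\, q_n t^{n+k-\eta}$$

$$= \eta \sum_{n=\eta+1}^{k+\eta} \binom{n-1}{\eta} t^{n-1-\eta} + \eta \sum_{n=\eta}^{\infty} \binom{n+k}{\eta} q_n t^{n+k-\eta}.$$

Using the recursion relation $q_n = q_{n-1} - p_n$ and the result of Corollary 3.6
we have upon simplification

$$\frac{g'(t)}{\eta} = \sum_{n=\eta}^{\infty} \binom{n+k}{\eta} q_{n-1} t^{n+k-\eta} - \sum_{n=\eta}^{\infty} \binom{n-1}{\eta-1} q_{\sigma_n} t^{n+k-\eta}$$

$$= \binom{k+\eta}{\eta} t^{k} + \sum_{n=\eta}^{\infty} \binom{n+k+1}{\eta} q_n t^{n+k+1-\eta} - t^{k} \psi(t).$$

Now by using the recursion relation for the binomial coefficient one has that

$$\sum_{n=\eta}^{\infty} \binom{n+k+1}{\eta} q_n t^{n+k+1-\eta} = g(t) + \frac{t}{\eta}\, g'(t),$$

so that

$$\frac{g'(t)}{\eta} = \binom{k+\eta}{\eta} t^{k} + \frac{t}{\eta}\, g'(t) + g(t) - t^{k} f(t) - t^{k} g(t)$$

and substitution shows that

$$\frac{\psi'(t)}{\eta} = \sum_{n=\eta}^{k+\eta} \binom{n}{\eta} t^{n-\eta} + \frac{t}{\eta}\, g'(t) + g(t) - t^{k} f(t) - t^{k} g(t).$$

Again using the recursion relation for the binomial coefficient one sees that

$$\sum_{n=\eta}^{k+\eta} \binom{n}{\eta} t^{n-\eta} = f(t) + \frac{t}{\eta}\, f'(t),$$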




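hence we have that $\psi'(t) = t\psi'(t) + (1 - t^{k})\eta\psi(t)$. Regarding this last
expression as a differential equation and integrating over $(0, t)$, we have, since
$\psi(0) = 1$,

$$\psi(t) = \exp\Big\{\eta \sum_{j=1}^{k} t^{j}/j\Big\}$$

and hence

$$\Phi(t) = \eta t^{k} \exp\Big\{\eta \sum_{j=1}^{k} t^{j}/j\Big\}.$$

Integrating $\int_0^{\beta} (1 - t)^{\eta} \psi'(t)\, dt$ by parts will show that

$$\eta \int_0^{\beta} (1 - t)^{\eta-1}\, t^{k} \psi(t)\, dt = 1 - (1 - \beta)^{\eta} \psi(\beta),$$

and hence we can now use theorem 4.2 to obtain

$$G(\beta, r) = 1 - (1 - \beta)^{\eta} \exp\Big\{\eta \sum_{j=1}^{k} \beta^{j}/j\Big\}$$

and

$$E(N) = \eta(\eta - 1) \int_0^1 (1 - t)^{\eta-2}\, t^{k} \psi(t)\, dt \quad\text{for } \eta \ge 2,$$

which gives us one result.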
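As a consequence of theorem 4.2, $E(N) = G_1(1, r)$ for $\eta = 1$. We now must
check only this case, and hence $G(\beta, r) = M(\beta, r) = 1 - \exp\{-\gamma(\beta, k)\}$ where

$$\gamma(\beta, k) = -\ln(1 - \beta) - \sum_{j=1}^{k} \beta^{j}/j = \int_0^{\beta} t^{k} (1 - t)^{-1}\, dt.$$

Therefore

$$E(N) = \lim_{\beta \to 1} G_1(\beta, r) = \lim_{\beta \to 1} \frac{\beta^{k}}{(1 - \beta) \exp\{\gamma(\beta, k)\}} = \lim_{\beta \to 1} \beta^{k} \exp\Big\{\sum_{j=1}^{k} \beta^{j}/j\Big\} = \exp\Big\{\sum_{j=1}^{k} 1/j\Big\},$$

which is the result claimed.
We notice here that for every $\alpha, \beta \in (0, 1)$ and every $\eta \in W$ we can choose $k$
so that $G(\beta, r) \le \alpha$ by simply taking $k$ large enough that

$$\sum_{j=1}^{k} \beta^{j}/j \ge \frac{1}{\eta} \ln(1 - \alpha) - \ln(1 - \beta).$$

That this is always possible can be seen from the fact that

$$\lim_{k \to \infty} \sum_{j=1}^{k} \beta^{j}/j = -\ln(1 - \beta).$$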
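
The closed form for the fixed increment procedure lends itself to a quick numerical check and to a search for the smallest admissible increment. The sketch below is illustrative and rests on stated assumptions: the stopping probabilities are taken as $p_n = q_{\sigma_n}\binom{n-1}{\eta-1}/\binom{n+k}{\eta}$ with $\sigma_n = \max(0, n-k-1)$, the series is truncated at a hypothetical `n_max`, and all function names are invented for the example.

```python
from math import comb, exp


def jirina_coverage_cdf(beta, eta, k):
    """Closed form 1 - (1 - beta)^eta * exp(eta * sum_{j<=k} beta^j / j)."""
    s = sum(beta ** j / j for j in range(1, k + 1))
    return 1.0 - (1.0 - beta) ** eta * exp(eta * s)


def jirina_coverage_cdf_series(beta, eta, k, n_max=400):
    """Truncated series sum_n p_n I_beta(b_n + 1 - eta, eta) with b_n = n + k."""
    q = [1.0] * (n_max + 1)
    total = 0.0
    for n in range(1, n_max + 1):
        sigma = max(0, n - k - 1)
        p = q[sigma] * comb(n - 1, eta - 1) / comb(n + k, eta)
        q[n] = q[n - 1] - p
        m = n + k  # sample size when stopping at inspection n
        tail = sum(comb(m, j) * beta ** (m - j) * (1 - beta) ** j for j in range(eta))
        total += p * tail
    return total


def smallest_increment(alpha, beta, eta, k_max=10_000):
    """Smallest k >= eta (so that eta < k + 1) with G(beta) <= alpha, else None."""
    for k in range(max(1, eta), k_max + 1):
        if jirina_coverage_cdf(beta, eta, k) <= alpha:
            return k
    return None


if __name__ == "__main__":
    print(jirina_coverage_cdf(0.7, 2, 3), jirina_coverage_cdf_series(0.7, 2, 3))
    print(smallest_increment(0.1, 0.9, 1))
```

Since the exponent grows with $k$, the closed form is decreasing in $k$, so the first $k$ that meets the bound is the smallest one.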


6. Comparison of sequential procedures. In what we have considered so far we
have determined the random sample size by a sequence of events which obtain
or not from the sequence of observations. It follows from the above that
$r \in R$ determines a random sample size $N_r$, and

$$G(\beta, r) = P_W[Q(X_W^{N_r}) \le \beta] = E_{N_r} P_W[Q(X_W^{n}) \le \beta \mid N_r = n].$$

It follows that for any independent r.v. $N$ with the same distribution in $W$ as
$N_r$ the randomized decision rule $r^* = (\eta, \lambda, N)$ would have the same distribution
of coverage and generating function.

This extension of decision rules allows us to encompass a wider class for com- 
parison. 

If $r$, $r'$ are decision rules we say they are comparable iff $E(N_r) = E(N_{r'})$.
Now for comparable rules we see that $r$ is better than $r'$ at $\beta \in (0, 1)$ iff $G(\beta, r) <
G(\beta, r')$. We modify our nomenclature in the natural way when the inequality
holds uniformly for all $\beta$ in some interval and a rule is said to be best when it is
better than all the rules in some set of rules. (In the following sections primes
affixed to the characters $\Phi$, $G$ and $M$ do not refer to derivatives.)

THEOREM 6.1: Let $r$, $r'$ be decision rules with associated functions $\Phi$ and $\Phi'$,
respectively, such that both have number $\eta$. If there exists $\beta_0 \in (0, 1)$ such that $\Phi' < \Phi$
on $(0, \beta_0)$ and $\Phi < \Phi'$ on $(\beta_0, 1)$, i.e., $\beta_0$ is the unique zero of $\Phi - \Phi'$ in $(0, 1)$,
then and only then is $G' < G$ on $(0, 1)$.

PROOF: By the results of Theorem 4.2 we have by letting $h = \Phi - \Phi'$ that
$h > 0$ on $(0, \beta_0)$ and $-h > 0$ on $(\beta_0, 1)$ and hence for each $\beta \in (0, 1)$

$$G(\beta) - G'(\beta) = \int_0^{\beta} (1 - t)^{\eta-1} h(t)\, dt = -\int_{\beta}^{1} (1 - t)^{\eta-1} h(t)\, dt,$$

with the last equation following, since $\int_0^1 (1 - t)^{\eta-1} h(t)\, dt = 0$. From the
expression above necessary and sufficient conditions follow.

We also mention the immediate: 

COROLLARY 6.1.1: If $\Phi$, $\Phi'$ are defined as above and are such that $\Phi' > \Phi$ on
$(\beta_2, 1)$ and $\Phi' < \Phi$ on $(0, \beta_1)$ where $\beta_1 < \beta_2$, i.e., $\Phi - \Phi'$ has more than one zero
on $(0, 1)$, then we know that $G' < G$ on $(0, \beta_1)$ and on $(\beta_2, 1)$.

We now have a criterion for comparison of error functions in the terms of the
associated functions $\Phi$ and we have immediately:

COROLLARY 6.1.2: If $\eta = 1$ and $M_1(1, r) < M_1(1, r')$ then there exists one
$\beta_0 \in (0, 1)$ such that $G < G'$ on $(\beta_0, 1)$ and if $\eta = 2$ and $M_2(1, r) < M_2(1, r')$
then there exists a $\beta_0$ such that $G < G'$ on $(\beta_0, 1)$.

We shall say of two procedures $\chi_1$ and $\chi_2$ that $\chi_1$ is better than $\chi_2$ at $\beta \in (0, 1)$
iff $r_1 \in \chi_1$, $r_2 \in \chi_2$ are comparable implies $G(\beta, r_1) < G(\beta, r_2)$. Further we shall
say $\chi_1$ is uniformly better than $\chi_2$ on $(0, 1)$ iff comparability of $r_1 \in \chi_1$ and $r_2 \in \chi_2$
implies the inequality holds for all $\beta$ in the unit interval.

Jiřina claims in [11] that for number $\eta = 1$, Wilks' procedure is uniformly better
than his own on $(0, 1)$ but for any number $\eta \ge 2$ his procedure is better than Wilks'
for $\beta$ sufficiently close to unity. However, he assumes firstly, that the procedures
are comparable by disregarding the difference between $\gamma$, the ASN for his decision
rule, and $[\gamma]$ (the greatest integer less than $\gamma$), the ASN for the Wilks' decision
rule, and secondly, makes no mention of the consideration that the neighborhood
of 1 in which his decision rule is better might well depend upon $\gamma$.

We know that $\exp\{\sum_{j=1}^{k} 1/j\}$, which is the expected sample size for $\eta = 1$
for the Jiřina procedure, is not integral for any $k \in W$ and hence the two
procedures are not comparable for $\eta = 1$. From the complexity of the expression
for the ASN for $\eta \ge 2$ given in Procedure 2 it is not apparent that for any
value of $\eta$ one would find the procedures comparable.

However, lack of attention to these details does not vitiate Jiřina's theorem.
In fact, by modifying the argument slightly so as to assure comparability and
with slight adaptation, Jiřina's proof applies to any other procedure.

Let $\gamma > \eta$ and $\eta \in W$ be given as an ASN and deletion number, respectively.
Now define $N$ by letting $m = [\gamma]$ and setting

$$N = \begin{cases} m & \text{with probability } s, \\ m + 1 & \text{with probability } 1 - s, \end{cases}$$

where $\gamma - m = 1 - s$. Such a rule we call a randomized Wilks decision rule
$r = (\eta, \lambda, N)$. Let $N'$ be any other random sample size which assumes value $n$
with probability $p_n$ such that $\sum_{n=1}^{\infty} n p_n = \gamma$.

We quote in our terminology: 

THEOREM 6.2 (Jiřina): For $\eta = 1$ a randomized Wilks procedure is uniformly
better on $(0, 1)$ than any other comparable procedure for every expected sample size.

THEOREM 6.3 (Jiřina): For $\eta \ge 2$ and a randomized Wilks decision rule $r =
(\eta, \lambda, N)$ and any other comparable decision rule $r' = (\eta, \lambda, N')$ there exists a
unique $\beta_0 \in (0, 1)$ depending on $(r, r')$ such that $r$ is uniformly better on $(0, \beta_0)$
and $r'$ uniformly better in $(\beta_0, 1)$.

We shall not concern ourselves with the original proof of these theorems since 
it is lengthy and an alternate proof will be given later. 
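
For deletion number 1 the statement of Theorem 6.2 can be illustrated numerically, since then $G(\beta, r) = E(\beta^{N})$. The sketch below is a hedged example, not a proof: it builds a randomized Wilks rule whose ASN equals that of a fixed increment rule, $\exp\{\sum_{j \le k} 1/j\}$, and compares the two coverage distributions on a grid. All function names are hypothetical, and `floor` is used for $[\gamma]$, which agrees with the text whenever $\gamma$ is not an integer.

```python
from math import exp, floor


def jirina_asn_eta1(k):
    """Expected sample size exp(sum_{j<=k} 1/j) of the fixed increment rule, eta = 1."""
    return exp(sum(1.0 / j for j in range(1, k + 1)))


def jirina_cdf_eta1(beta, k):
    """G(beta) = 1 - (1 - beta) exp(sum_{j<=k} beta^j / j) for eta = 1."""
    return 1.0 - (1.0 - beta) * exp(sum(beta ** j / j for j in range(1, k + 1)))


def randomized_wilks_cdf_eta1(beta, gamma):
    """G(beta) = E(beta^N) with N = m w.p. s and m + 1 w.p. 1 - s, E(N) = gamma."""
    m = floor(gamma)
    s = 1.0 - (gamma - m)
    return s * beta ** m + (1.0 - s) * beta ** (m + 1)


if __name__ == "__main__":
    k = 2
    gamma = jirina_asn_eta1(k)
    for beta in (0.5, 0.8, 0.95):
        print(beta, randomized_wilks_cdf_eta1(beta, gamma), jirina_cdf_eta1(beta, k))
```

The comparison reflects the convexity of $n \mapsto \beta^{n}$: among sample-size distributions with a given mean, the one concentrated on two adjacent integers gives the smallest value of $E(\beta^{N})$.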

This last result leads us to: 

THEOREM 6.4: There does not exist a decision rule with number $\eta \ge 2$ which is
uniformly better on $(0, 1)$ than all comparable decision rules with number $\eta$.

PROOF: Suppose that we have such a rule with associated function $\Phi'$ and ASN
equal to $\gamma$. Then by Theorem 6.1 if $\Phi$ is the associated function of any other
comparable rule and $h = \Phi - \Phi'$, then $h$ must possess exactly one zero at, say,
$\beta \in (0, 1)$, and $h > 0$ on $(0, \beta)$ and $-h > 0$ on $(\beta, 1)$. But by Theorem 4.2
we know

$$(\eta - 1) \int_0^1 (1 - t)^{\eta-2} h(t)\, dt = \gamma - \gamma = 0, \tag{*}$$

and we also have

$$\int_0^1 (1 - t)^{\eta-1} h(t)\, dt = 1 - 1 = 0. \tag{**}$$

Therefore, letting $g(t) = (1 - t)^{\eta-2} h(t)$ we have from (*) $\int_0^{\beta} g = \int_{\beta}^{1} -g$ and
from (*) and (**) $\int_0^{\beta} t g(t)\, dt = \int_{\beta}^{1} -t g(t)\, dt$. But

$$\int_0^{\beta} t g(t)\, dt < \beta \int_0^{\beta} g = \beta \int_{\beta}^{1} -g < \int_{\beta}^{1} -t g(t)\, dt,$$

which is a contradiction and proves the result claimed.





For his procedure Jiřina has constructed tables of the value of the parameter
$k$ needed to attain a value of the error function less than 0.1, .05, .01 for values
of $\beta$ equal to .8, .9, .95. Of course the question is, how does the point $\beta_0$ as
defined in theorem 6.3 behave as we alter $k$?

This can be partially answered as follows.

THEOREM 6.5: Let $N$ be any random sample size and $N'$ a random sample size
with the same expectation which has positive probability at no more than two
integers which are adjacent. Now define $r_n = (2, \lambda, N + n)$, $r_n' = (2, \lambda, N' + n)$
as translated rules where the number of both is $\eta = 2$. Then for any $\beta \in (0, 1)$ there
exists $m \in W$ such that $n > m$ implies $G(\beta, r_n) > G(\beta, r_n')$.

PROOF: Let $h_n(t) = G(t, r_n) - G(t, r_n')$; using theorem 4.2 we have

$$h_n(t) = t^{n}\Big[G(t, r_0) - G(t, r_0') + \frac{1 - t}{t}\, n\big(M(t, r_0) - M(t, r_0')\big)\Big].$$

But it follows by theorem 6.2 that $M(t, r_0) > M(t, r_0')$ for all $t \in (0, 1)$ hence
for any $\beta \in (0, 1)$ we can, by taking $n$ sufficiently large, force $h_n$ to be positive
and hence $r_n'$ is uniformly better than $r_n$ in $(0, \beta)$.

Since one is usually concerned with values of $\beta$ near 1, one might be led to
think from theorem 6.3 that in practical tolerance estimation situations with $\eta \ge
2$ one could advantageously use a sequential procedure. Unfortunately,
however, we are also interested in having $\alpha$ small which forces the ASN to be large.
To help clarify this situation we examine $G(\beta, \cdot)$.

We know that for any $r \in R$ with number $\eta$, we have

$$G(\beta, r) = \sum_n p_n I_\beta(n + 1 - \eta, \eta), \qquad I_\beta(n + 1 - \eta, \eta) = \beta^{n} \sum_{k=0}^{\eta-1} \binom{n}{k} \Big(\frac{1 - \beta}{\beta}\Big)^{k},$$

by a well known identity; clearly $G(\beta, r)$ is only a linear combination of points
on the graph of $I_\beta(\cdot, \eta)$. We now examine this function.


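Set $f(x) = \beta^{x} \sum_{k=0}^{\eta-1} \binom{x}{k} \gamma^{k}$, where $\gamma = (1 - \beta)/\beta$. We wish to find for $\beta$
fixed, the values of $x > 0$, where (1) $f$ is convex and (2) $f$ is concave. Clearly
(1) iff $f''(x)\beta^{-x} > 0$, (2) iff $f''(x)\beta^{-x} < 0$. Now upon taking derivatives we
have for all $\eta \in W$

$$f''(x)\,\beta^{-x} = (\ln \beta)^{2} \sum_{k=0}^{\eta-1} \binom{x}{k} \gamma^{k} + 2 \ln \beta \sum_{k=0}^{\eta-1} \frac{\gamma^{k}}{k!} \sum_{i=1}^{k} S_k^{i}\, i\, x^{i-1} + \sum_{k=0}^{\eta-1} \frac{\gamma^{k}}{k!} \sum_{i=2}^{k} S_k^{i}\, i(i - 1)\, x^{i-2}$$

where the $S_k^{i}$ are Stirling numbers (of the first kind, so that $x(x-1)\cdots(x-k+1) = \sum_i S_k^{i} x^{i}$).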





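Evaluation in the following special cases yields: if

$\eta = 1$: $f''(x)\beta^{-x} = (\ln \beta)^{2}$,

$\eta = 2$: $f''(x)\beta^{-x} = (\ln \beta)^{2}(1 + \gamma x) + 2\gamma \ln \beta$,

$\eta = 3$: $f''(x)\beta^{-x} = (\ln \beta)^{2}\big[1 + \gamma x + \tfrac{\gamma^{2}}{2}(x^{2} - x)\big] + 2 \ln \beta\,\big[\gamma + \tfrac{\gamma^{2}}{2}(2x - 1)\big] + \gamma^{2}$.

We remark that for $\eta = 1$ $f$ is a convex function for all $x > 0$ and hence that
the Wilks' procedure is uniformly best on $(0, 1)$ which is of course in agreement
with theorem 6.2. We also have: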

THEOREM 6.6: If $r = (2, \lambda, N)$ is such that $P_W[N = n] = 0$ for $n \le [x_\beta] + 1$
where $x_\beta = -2/\ln \beta - \beta/(1 - \beta)$ and $r'$ is the comparable rule with positive
probability at no more than two adjacent integers and both have numbers $\eta = 2$,
then $r'$ is uniformly better than $r$ on $(0, \beta)$.

PROOF: Using the equation above for $\eta = 2$ we have

$$f''(x)\beta^{-x} > 0 \quad\text{iff}\quad x > \frac{-2}{\ln \beta} - \frac{\beta}{1 - \beta} = x_\beta.$$

Hence $r'$ is better than $r$ at $\beta$ and by the results of theorem 6.3 it must be
uniformly better on $(0, \beta)$.

The theorem above also throws light on the results of theorem 6.3 as to why
$r'$ is not uniformly better on $(0, 1)$.

As an aid in computation we prove: 

COROLLARY 6.6.1: With $f$ and $x_\beta$ defined as above for $\eta = 2$ we have

(1) $f(x_\beta) = 2e^{-1}[1 + (\beta - 1)^{2}/8 - (\beta - 1)^{3}/8 + o(1 - \beta)^{4}]$,
(2) $x_\beta = (1 - \beta)^{-1} - (1 - \beta)/6 - (1 - \beta)^{2}/12 + o(1 - \beta)^{3}$.

PROOF: (1) We write $f(x_\beta) = 2e^{-2}h(\beta)$ where $h = g^{-1}e^{g}$ and
$g(\beta) = -\beta \ln \beta/(1 - \beta)$.

Expand $h$ in a power series in terms of $g$ and simplify and one obtains the result.

(2) We write $h(1 - \beta) = (1 - \beta)x_\beta$ which has a power series about $a =
1 - \beta$. Expanding and simplifying yields the result.
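
As an aid in applying the corollary, the following sketch compares the exact values of $x_\beta$ and $f(x_\beta)$ for deletion number 2 with the truncated expansions. It is a hypothetical illustration: it assumes $f(x) = \beta^{x}(1 + \gamma x)$ with $\gamma = (1-\beta)/\beta$, reads expansion (1) with leading factor $2e^{-1}$ and expansion (2) with leading term $(1-\beta)^{-1}$, and the function names are invented.

```python
from math import exp, log


def x_beta(beta):
    """Inflection point x_beta = -2 / ln(beta) - beta / (1 - beta) for eta = 2."""
    return -2.0 / log(beta) - beta / (1.0 - beta)


def f_eta2(x, beta):
    """f(x) = beta^x (1 + gamma x), gamma = (1 - beta) / beta."""
    gamma = (1.0 - beta) / beta
    return beta ** x * (1.0 + gamma * x)


def x_beta_series(beta):
    """Truncated expansion (2): 1/a - a/6 - a^2/12 with a = 1 - beta."""
    a = 1.0 - beta
    return 1.0 / a - a / 6.0 - a * a / 12.0


def f_at_x_beta_series(beta):
    """Truncated expansion (1): 2 e^{-1} [1 + a^2/8 + a^3/8] with a = 1 - beta."""
    a = 1.0 - beta
    return 2.0 * exp(-1.0) * (1.0 + a * a / 8.0 + a ** 3 / 8.0)


if __name__ == "__main__":
    for beta in (0.8, 0.9, 0.95, 0.99):
        print(beta, x_beta(beta), x_beta_series(beta),
              f_eta2(x_beta(beta), beta), f_at_x_beta_series(beta))
```

Note that $-(\beta - 1)^{3}/8$ equals $+(1 - \beta)^{3}/8$, which is the form used in the code; the agreement improves as $\beta$ approaches 1.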

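A further result on the comparison of two decision rules in the case of the
number $\eta = 2$ is as follows.

THEOREM 6.7: Let $r = (2, \lambda, N)$ be given where $p_j, p_k > 0$ for some $j + 1 < k$;
let us define $r' = (2, \lambda, N')$ by $P_W[N' = n] = p_n' \ge 0$ with $p_j' = p_j - \epsilon$, $p_m' =
p_m + \epsilon$, $p_l' = p_l + \epsilon$, $p_k' = p_k - \epsilon$ and $p_n' = p_n$ for $n$ elsewhere, and we suppose
that $j < m \le l < k$ and $m - j = k - l = s$; then $r'$ is uniformly better than
$r$ on $(0, (j/k)^{1/s})$.

PROOF: Let $\sum' = \sum_{n \in W'}$ where $W' = \{n \in W : n \ne j, m, l, k\}$. Let $jm^{-1} =
\delta$, $lk^{-1} = \sigma$, $ml^{-1} = \gamma$, $k - m = l - j = \nu s$ where $\nu$ is some rational number.
From the above follows

$$0 < \gamma \le 1, \qquad 0 < \delta < \sigma < 1, \qquad 1 + \gamma\sigma\delta = \sigma(1 + \gamma). \tag{*}$$

Let us denote $M(\cdot, r')$ by $M'$ and similarly for $G'$, $M$, and $G$.

$$M'(\beta) = \sum_n p_n' \beta^{n} = \sum{}' p_n \beta^{n} + (p_j - \epsilon)\beta^{j} + (p_m + \epsilon)\beta^{m} + (p_l + \epsilon)\beta^{l} + (p_k - \epsilon)\beta^{k}$$

$$M'(\beta) = M(\beta) - \epsilon\beta^{j} + \epsilon\beta^{m} + \epsilon\beta^{l} - \epsilon\beta^{k} = M(\beta) - \epsilon\beta^{j}(1 - \beta^{s})(1 - \beta^{\nu s})$$

$$E(N') = E(N) - \epsilon j + \epsilon m + \epsilon l - \epsilon k = E(N) + \epsilon(m - j) - \epsilon(k - l)$$

hence we have $r$, $r'$ comparable.

$$G'(\beta) = M'(\beta) + (1 - \beta)M_1'(\beta)$$

$$= M(\beta) - \epsilon\beta^{j}(1 - \beta^{s})(1 - \beta^{\nu s}) + (1 - \beta)\big[M_1(\beta) + m\epsilon\beta^{j-1}(-\delta + \beta^{s}) + \epsilon k\beta^{l-1}(\sigma - \beta^{s})\big].$$

Now upon simplification we obtain $G'(\beta) - G(\beta) = \epsilon\beta^{j-1}g(\beta)$ where we have
$g(\beta) = -\beta(1 - \beta^{s})(1 - \beta^{\nu s}) + (1 - \beta)[m(\beta^{s} - \delta) + \beta^{\nu s}k(\sigma - \beta^{s})]$. We now
must examine $g$; setting $t = \beta^{s}$, $h(\beta^{s}) = g(\beta)$ we have

$$h(t) = -t^{1/s}(1 - t)(1 - t^{\nu}) + (1 - t^{1/s})\, k f(t)$$

where $f(t) = (1/k)[m(t - \delta) + t^{\nu}k(\sigma - t)] = t^{\nu}(\sigma - t) + \gamma\sigma(t - \delta)$. Using
(*) above we see that $f(t) = (t^{\nu} - \gamma\sigma\delta)(\sigma - t) + (1 - \sigma)(t - \gamma\sigma\delta)$ from
which we can see that $t < \gamma\sigma\delta$ implies $f(t) < 0$.

Set $t_0 = \gamma\sigma\delta$; therefore $h < 0$ on $(0, t_0)$ and we have $g < 0$ on $(0, \beta_0)$ where
$\beta_0 = t_0^{1/s}$ and hence $G' < G$ on the same interval.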
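
The effect of moving probability mass inward, as in the theorem, can be seen in a small numerical example. The sketch below is an illustration with invented names and a hypothetical sample-size distribution; it assumes deletion number 2, so that a rule with $P[N = n] = p_n$ has $G(\beta) = \sum_n p_n[\beta^{n} + n\beta^{n-1}(1 - \beta)]$.

```python
def coverage_cdf_eta2(beta, pmf):
    """G(beta) = sum_n p_n [beta^n + n beta^(n-1) (1 - beta)] for eta = 2.

    `pmf` maps sample sizes n to probabilities p_n.
    """
    return sum(p * (beta ** n + n * beta ** (n - 1) * (1.0 - beta))
               for n, p in pmf.items())


def shift_mass_inward(pmf, j, k, s, eps):
    """Move eps from j to m = j + s and eps from k to l = k - s (mean unchanged)."""
    m, l = j + s, k - s
    if not (j < m <= l < k):
        raise ValueError("need j < j + s <= k - s < k")
    if pmf.get(j, 0.0) < eps or pmf.get(k, 0.0) < eps:
        raise ValueError("eps exceeds the mass available at j or k")
    new = dict(pmf)
    for n, delta in ((j, -eps), (m, eps), (l, eps), (k, -eps)):
        new[n] = new.get(n, 0.0) + delta
    return new


if __name__ == "__main__":
    pmf = {3: 0.5, 9: 0.5}                      # hypothetical rule r
    new = shift_mass_inward(pmf, 3, 9, 2, 0.1)  # rule r' with m = 5, l = 7
    beta_0 = (3 / 9) ** (1 / 2)                 # (j/k)^(1/s)
    for beta in (0.1, 0.3, 0.5):
        print(beta, coverage_cdf_eta2(beta, new) - coverage_cdf_eta2(beta, pmf), beta_0)
```

In this example the difference $G' - G$ is negative at every grid point below $(j/k)^{1/s}$, as the theorem asserts; the theorem gives a sufficient interval, and the inequality may persist beyond it.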

COROLLARY 6.7.1: If in the above theorem the number $\eta = 1$ then $r'$ is uniformly
better than $r$ on $(0, 1)$.

PROOF: The result follows immediately from the identity

$$M'(\beta) = M(\beta) - \epsilon\beta^{j}(1 - \beta^{s})(1 - \beta^{\nu s})$$

used in the preceding argument.


These last results have been carried out for the simpler cases for numbers
$\eta = 1, 2$, and clearly for larger $\eta$ the results are more tedious.


Acknowledgments: The author is much indebted to Prof. Z. W. Birnbaum,
of the University of Washington, who first proposed this problem, directed
the thesis research, and then allowed this paper to be published under the author's name.


REFERENCES 


[1] S. S. Wilks, "Determination of sample sizes for setting tolerance limits," Ann. Math. Stat., Vol. 12 (1941), pp. 91-96.

[2] A. Wald, "An extension of Wilks' method for setting tolerance limits," Ann. Math. Stat., Vol. 14 (1943), pp. 45-55.

[3] H. Scheffé and J. W. Tukey, "Non-parametric estimation: I. Validation of order statistics," Ann. Math. Stat., Vol. 16 (1945), pp. 187-192.

[4] J. W. Tukey, "Nonparametric estimation: II. Statistically equivalent blocks and tolerance regions, the continuous case," Ann. Math. Stat., Vol. 18 (1947), pp. 529-539.

[5] J. W. Tukey, "Nonparametric estimation: III. Statistically equivalent blocks and tolerance regions, the discontinuous case," Ann. Math. Stat., Vol. 19 (1948), pp. 30-39.

[6] D. A. S. Fraser and R. Wormleighton, "Nonparametric estimation: IV," Ann. Math. Stat., Vol. 22 (1951), pp. 294-298.

[7] D. A. S. Fraser, "Sequentially determined statistically equivalent blocks," Ann. Math. Stat., Vol. 22 (1951), pp. 372-381.

[8] D. A. S. Fraser, "Nonparametric tolerance regions," Ann. Math. Stat., Vol. 24 (1953), pp. 44-55.

[9] D. A. S. Fraser and Irwin Guttman, "Tolerance regions," Ann. Math. Stat., Vol. 27 (1956), pp. 162-179.

[10] J. H. B. Kemperman, "Generalized tolerance limits," Ann. Math. Stat., Vol. 27 (1956), pp. 180-186.

[11] Miroslav Jiřina, "Sequential estimation of distribution-free tolerance limits," Čehoslovack. Mat. Ž. 2(77) (1952), pp. 211-232; correction 3(78) (1953), p. 283.

[12] David Blackwell, "On a class of probability spaces," Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Vol. II (1956), pp. 1-6.

[13] Z. W. Birnbaum and H. Rubin, "On Distribution-Free Statistics," Ann. Math. Stat., Vol. 25 (1954), pp. 593-598.





A USEFUL GENERALIZATION OF THE STEIN 
TWO-SAMPLE PROCEDURE 


By R. Wormleighton
University of Toronto 


1. Introduction. An experimenter wishes to estimate the mean, $\mu$, of a normal
distribution using a sample mean. If the variance, $\sigma^2$, is known, the size of a single
sample can be functionally related to the precision of the sample estimate, e.g.:
$n^{1/2} = (1.96)\sigma/L$, where $n$ is the sample size and $2L$ is the length of a 95% confidence
interval for $\mu$. The experimenter can choose in advance the point on the
curve which provides a satisfactory balance between the cost of obtaining the
sample and the precision of the final estimate.

If the variance is not known, and no reasonable estimate can be obtained, 
several simple procedures are presently available. 

(i) The experimenter takes as large a sample as he can afford. The estimate 
of the mean has maximum precision, but it may be more precise than he requires. 

(ii) He takes a preliminary sample to get an estimate of the variance. On the 
basis of this variance estimate, he decides on the size of a second sample. His 
estimate of $\mu$ is the mean of the second sample and its precision is determined
from the second sample by the usual single-sample procedure. This method is 
wasteful of the information in the first sample. 

(iii) He can use a Stein two-sample procedure. Here, he specifies the precision 
of his final estimate in advance, and the total number of observations becomes 
a random variable. This is often unattractive because the cost of the experiment 
is not pre-determined and may turn out to be excessive. 

The experimenter would like to take a first sample to get a variance estimate, 
then decide on the total number of observations and the precision of the estimate 
of the mean, and finally use all his data in making the estimate. This can very 
nearly be accomplished by the generalized Stein procedure described below. 


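2. Procedure. We are given a normal population with unknown mean, $\mu$, and
unknown variance, $\sigma^2$. Consider a first sample of $n_0$ observations: $x_1, x_2, \dots, x_{n_0}$.
An estimate of $\sigma^2$, based on the first sample, is

$$s^2 = \frac{1}{n_0 - 1}\sum_{i=1}^{n_0}(x_i - \bar{x}_0)^2,$$

where $\bar{x}_0$ denotes the mean of the first sample. Corresponding to any particular value of $s$, we can plot the curve

$$n_s(L) = \left(\frac{t^{(\alpha)} s}{L}\right)^2$$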

Received June 20, 1955. 







[Fig. 1. Total sample size $n$ vs. confidence interval half-length $L$. Cut (a): fixed sample size. Cut (b): modified Stein procedure. Cut (c): bounded sample size.]


where $n$ = total sample size, $2L$ = length of a $(1 - \alpha)$ confidence interval, and
$t^{(\alpha)}$ = $(1 - \alpha)$ point of a $t$-variate with $(n_0 - 1)$ degrees of freedom. We thus
obtain a family of such curves which do not intersect. (See Fig. 1.)

(i) On each curve choose, in advance, a single point. This set of points constitutes
a "cut" across the family of curves.

(ii) Now actually take the first sample and calculate $s$. This determines a
particular curve of the family and, because of the cut, a unique point, $(n^*, L^*)$,
say.

Remark: There appears to be no practical advantage and, in fact, a waste of
information if $n^* < n_0$. We therefore exclude cuts which permit this situation
to occur.

(iii) Take $[n^* - n_0] + 1$ further observations, where $[q]$ denotes the largest
integer strictly less than $q$.

(iv) Calculate $\bar{x}$, the mean of all the observations. Then $\bar{x} \pm L^*$ is a $(1 - \alpha)$
confidence interval for $\mu$.

Remark: If the cut is such that $n^*$ is not an integer, the exact confidence coefficient
is slightly higher than $(1 - \alpha)$. The approximation can be avoided
either by excluding cuts which might yield non-integral values of $n^*$, or by giving
the last observation a smaller weight in calculating the sample mean, $\bar{x}$. See
Section 5.
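Steps (ii) to (iv) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the percentage point $t^{(\alpha)}$ is supplied by the caller (taken from a $t$ table with $n_0 - 1$ degrees of freedom), and a cut is modelled as a function returning the point $(n^*, L^*)$ on the curve for the observed $s$.

```python
import math
from statistics import mean, stdev


def strict_floor(q):
    """[q] in the text: the largest integer strictly less than q."""
    return math.ceil(q) - 1


def curve_half_length(s, n, t_alpha):
    """Half-length L on the curve n_s(L) = (t_alpha * s / L)**2 at sample size n."""
    return t_alpha * s / math.sqrt(n)


def plan_second_sample(first_sample, cut, t_alpha):
    """Steps (ii)-(iii): compute s, read (n*, L*) off the cut, count further observations."""
    n0 = len(first_sample)
    s = stdev(first_sample)  # square root of the (n0 - 1)-divisor variance estimate
    n_star, l_star = cut(s, n0, t_alpha)
    if n_star < n0:
        raise ValueError("cuts with n* < n0 are excluded")
    further = strict_floor(n_star - n0) + 1
    return s, n_star, l_star, further


def confidence_interval(all_observations, l_star):
    """Step (iv): the interval (xbar - L*, xbar + L*) from all observations."""
    xbar = mean(all_observations)
    return xbar - l_star, xbar + l_star


def fixed_size_cut(total):
    """Hypothetical example of a cut: always use `total` observations."""
    def cut(s, n0, t_alpha):
        return total, curve_half_length(s, total, t_alpha)
    return cut
```

A call such as `plan_second_sample(first, fixed_size_cut(8), 2.776)` with five first-stage observations asks for three further observations; the cut itself must be fixed before the first sample is inspected.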







3. Proof. The proof is that given by Stein. Let $U = (\bar{x} - \mu)\sqrt{n^*}/s$. For a
given cut, the value of $n^*$ depends only on $s$, which is statistically independent of
$\bar{x}$. Hence, the conditional distribution of $U$, given $s$, is $N(0, \sigma^2/s^2)$.

Consider a variate $T = Y/s$, where $Y$ is $N(0, \sigma^2)$ and statistically independent
of $s$. The conditional distribution of $T$, given $s$, is also $N(0, \sigma^2/s^2)$. $U$ and $T$ are
therefore identically distributed. But $T$, and hence $U$, has a $t$ distribution with
$(n_0 - 1)$ degrees of freedom. Therefore

$$(1 - \alpha) = \Pr\{-t^{(\alpha)} \le U \le t^{(\alpha)}\} = \Pr\left\{-\frac{t^{(\alpha)} s}{\sqrt{n^*}} \le \bar{x} - \mu \le \frac{t^{(\alpha)} s}{\sqrt{n^*}}\right\} = \Pr\{\bar{x} - L^* \le \mu \le \bar{x} + L^*\}.$$
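The coverage claim can be examined empirically. The sketch below is a hedged illustration, not a substitute for the proof: it simulates a cut of the modified Stein type (a preassigned half-length `l0`, never fewer than `n0` observations), with the percentage point `t_alpha` supplied by the caller. The function name, the seed, and all parameter values are assumptions made for the example.

```python
import math
import random
from statistics import mean, stdev


def simulate_coverage(mu, sigma, n0, l0, t_alpha, reps, seed=0):
    """Fraction of simulated intervals xbar +/- L* that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        first = [rng.gauss(mu, sigma) for _ in range(n0)]
        s = stdev(first)
        # Point on the curve n_s(L): use L = l0 unless n0 already suffices.
        n_star = max(n0, (t_alpha * s / l0) ** 2)
        l_star = t_alpha * s / math.sqrt(n_star)
        # [n* - n0] + 1 with [q] strictly below q is the same as ceil(n* - n0).
        extra = math.ceil(n_star - n0)
        observations = first + [rng.gauss(mu, sigma) for _ in range(extra)]
        xbar = mean(observations)
        hits += (xbar - l_star <= mu <= xbar + l_star)
    return hits / reps
```

With `n0 = 10` and `t_alpha = 2.262` (the two-sided 95% point for 9 degrees of freedom), the simulated coverage should sit near 0.95, up to Monte Carlo error and the slight excess caused by rounding the sample size upward.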


4. Possible cuts. 

(a) Fixed sample size: (Line (a) in Figure.) The cut is defined by $n = N_0$, a
constant ($N_0 \ge n_0$). The length of the confidence interval is a random variable.
If $N_0 = n_0$, we have the usual single-sample procedure. If $N_0 > n_0$, the procedure
differs from the single-sample procedure in that the variance estimate is
based on fewer degrees of freedom, $(n_0 - 1)$ instead of $(N_0 - 1)$. The length
of the confidence interval will thus have larger expectation and larger variance
than one calculated from a single sample.

(b) A modified Stein method: (Line (b) in Figure.) A confidence interval
length, $2L_0$, is preassigned. The cut is defined by

$$n = n_0, \qquad 0 < L \le L_0,$$
$$L = L_0, \qquad n_0 \le n < \infty.$$

In Stein's exact, although not in his approximate procedure, a second sample
of at least one observation is always required, and a weighted average is used to
estimate $\mu$. If $s$ should turn out to be so small that only one additional observation
is required, the last observation is given an excessive weight so that the
precision of the estimate of $\mu$ is actually reduced. This device ensures that, even
in this situation, the pre-assigned confidence interval length, $2L_0$, is obtained.
See Section 5.

In our modification, defined by cut (b), we use an unweighted average. When
$s$ turns out to be so small that no further observations are required, we obtain a
confidence interval shorter than we anticipated, and the method reduces to the
usual single-sample procedure.

(c) Bounded sample size: (Cut (c) in Figure.) An experimenter would like to
have a $(1 - \alpha)$ confidence interval of length $2L_1$, but is not willing to take
more than $N_1$ observations. He may then use the cut

$$n = n_0, \qquad 0 < L \le L_1,$$
$$L = L_1, \qquad n_0 \le n \le N_1,$$
$$n = N_1, \qquad L \ge L_1.$$







He will then obtain the desired precision, or better, with the minimum number
of observations, if this number is less than $N_1$; otherwise, he will take $N_1$ observations
and settle for the precision he gets.
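Cuts (a), (b) and (c) each pick one point on the curve $n_s(L) = (t^{(\alpha)} s / L)^2$ for the observed $s$. The following sketch expresses them as functions; the names and argument order are hypothetical, `t_alpha` is assumed to be supplied from a $t$ table with $n_0 - 1$ degrees of freedom, and each function returns the pair $(n^*, L^*)$.

```python
import math


def half_length(s, n, t_alpha):
    """L on the curve n_s(L) = (t_alpha * s / L)**2 for a given n."""
    return t_alpha * s / math.sqrt(n)


def cut_fixed(s, n0, t_alpha, big_n0):
    """Cut (a): n = N0 whatever s is (requires N0 >= n0)."""
    if big_n0 < n0:
        raise ValueError("N0 must be at least n0")
    return big_n0, half_length(s, big_n0, t_alpha)


def cut_modified_stein(s, n0, t_alpha, l0):
    """Cut (b): half-length l0 if that needs at least n0 observations, else n = n0."""
    n_needed = (t_alpha * s / l0) ** 2
    n_star = max(n0, n_needed)
    return n_star, half_length(s, n_star, t_alpha)


def cut_bounded(s, n0, t_alpha, l1, n1):
    """Cut (c): like cut (b) with half-length l1, but never more than n1 observations."""
    if n1 < n0:
        raise ValueError("N1 must be at least n0")
    n_needed = (t_alpha * s / l1) ** 2
    n_star = min(max(n0, n_needed), n1)
    return n_star, half_length(s, n_star, t_alpha)
```

When $s$ is small, cuts (b) and (c) return $n^* = n_0$ with a half-length below the preassigned value; when $s$ is large, cut (c) stops at the bound and returns a half-length above it.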

(d) Delayed decision: Conceivably, one could define a cut by considering, in 
advance of the first sample, each possible value of s, and choosing a point on each 
of the curves, $n_s(L)$. It seems superfluous, however, to make a large number of
decisions when only one will be implemented. A possible procedure would be to 
take the first sample and calculate s, and only then make the decision on the 
basis of the one resulting curve. It is absolutely necessary that the decision not 
be influenced by the first sample mean; otherwise, $n^*$ is not independent of $\bar{x}$ and
the proof in Section 3 is not valid. This condition could be ensured when the 
decision-maker does not himself collect or analyze the data, or even see them, by 
informing him only of the observed value of s. It can then be argued that the 
decision on sample size is identical with that which would have resulted from a 
cut completely defined in advance of the first sample. 

This procedure can be compared with one in which the experimenter has an
independent estimate of the variance available to him when he is planning the 
experiment. If he intends to use the preliminary variance estimate for calculating 
the length of his confidence interval, and not the variance of the single sample 
he plans, then he can predetermine both sample size and confidence interval
length as in the case where the true variance is known. In our procedure, the 
preliminary variance estimate is obtained from part of the sample, but it is 
nevertheless independent of the sample mean. 

Throughout the discussion, we have assumed that the confidence interval 
length, 2L, is given in absolute units, the same as those of the observations. 
Scientists and engineers frequently prefer to specify the error as a percentage of 
the mean, and in order to convert absolute error to an approximate percentage 
error an estimate of the mean is required. There is a temptation in a delayed- 
decision procedure to use the first-sample mean for this purpose. This clearly is 
not permissible because the decision would be influenced by the first-sample 
mean. 

(e) Minimum cost: (Suggested by referee.) If $c_1(n)$ is the cost of taking $n$
observations and $c_2(L)$ is the cost of an interval of length $2L$, then a cut can
be defined by the minimization of the total cost, $c_1(n) + c_2(t^{(\alpha)} s/\sqrt{n})$, with
respect to $n$, for given $s$. Since total cost is minimized for every $s$, the expected
total cost is a minimum.
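As an illustration of cut (e), the minimization can be carried out by direct search. The sketch below rests on several assumptions that are not in the text: the search is restricted to integer $n$ between $n_0$ and a caller-chosen upper bound `n_max`, and the cost functions `c1` and `c2` are arbitrary callables supplied by the user.

```python
import math


def min_cost_cut(s, n0, t_alpha, c1, c2, n_max):
    """Return (n*, L*) minimizing c1(n) + c2(t_alpha * s / sqrt(n)) over n0 <= n <= n_max."""
    def total_cost(n):
        return c1(n) + c2(t_alpha * s / math.sqrt(n))

    n_star = min(range(n0, n_max + 1), key=total_cost)
    return n_star, t_alpha * s / math.sqrt(n_star)
```

For instance, with the hypothetical costs `c1 = lambda n: n` and `c2 = lambda L: 100 * L * L`, and with `t_alpha * s = 2`, the total cost is $n + 400/n$. As in cut (d), the cost functions must be fixed without reference to the first-sample mean.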


5. Weighted mean. In the exact Stein procedure, a weighted mean of the
observations provides an estimate of $\mu$. The weights depend on $s$, and must
satisfy the conditions:

(i) Sum of all weights = 1.

(ii) First-sample observations are each given the same weight, $a$.

(iii) Sum of squares of the weights = $z/s^2$, where $z$ is a pre-assigned constant.

It is also possible and desirable to require the condition:

(iv) All weights are non-negative.







The total sample size, $n$, is determined by

$$n = \max\left\{\left[\frac{s^2}{z}\right] + 1,\; n_0 + 1\right\}.$$

Case 1. If $s^2$ turns out so small that $z/s^2 = (1/n_0) + \epsilon$, $\epsilon > 0$, then only one
additional observation is made, which is given the weight $(1 - n_0 a)$ to satisfy
condition (i). Condition (iii) requires that $n_0 a^2 + (1 - n_0 a)^2 = (1/n_0) + \epsilon$. The
quadratic in $a$ has two solutions, but one of them makes $(1 - n_0 a)$ negative
and is inadmissible by (iv). The other solution is

$$a = \frac{1}{n_0 + 1}\left\{1 - \frac{1}{n_0}\sqrt{1 + \epsilon n_0(n_0 + 1)}\right\} < \frac{1}{n_0 + 1},$$

$$1 - n_0 a = \frac{1}{n_0 + 1}\left\{1 + \sqrt{1 + \epsilon n_0(n_0 + 1)}\right\} > \frac{1}{n_0 + 1}.$$

Hence, the weighting is uniquely determined. The last observation is given an
excessive weight, thereby reducing the precision to the pre-assigned value.

Case 2. $s^2/z > n_0$. Then, $n = [s^2/z] + 1$. Let

$$\frac{z}{s^2} = \frac{1}{n} + \epsilon.$$

There is considerable freedom in the choice of weights but we can require, for
simplicity, that the first $(n - 1)$ observations be given the same weight, $a$.
Then the last observation receives weight $1 - (n - 1)a$.

Under this restriction there are still two admissible solutions, one of which
is

$$a = \frac{1}{n}\left\{1 + \frac{1}{n - 1}\sqrt{\epsilon n(n - 1)}\right\},$$

$$1 - (n - 1)a = \frac{1}{n}\left\{1 - \sqrt{\epsilon n(n - 1)}\right\}.$$

This gives the last observation a reduced weight. One can interpret this weighted
average as a simple mean of all the observations except for a fraction of the last
observation, in effect making the total sample size a continuous variable.
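The weights of this section can be computed and checked against conditions (i), (iii) and (iv). The sketch below is an assumption-laden illustration: the closed forms in the comments are one reading of the Case 1 and Case 2 solutions (obtained by solving the quadratic implied by conditions (i) and (iii)), the function name is hypothetical, and the conditions are verified numerically by the caller rather than taken on trust.

```python
import math


def stein_weights(s2, z, n0):
    """Weights for the exact Stein procedure, given s^2, the constant z, and n0.

    Returns a list of n weights: equal weights a on all but the last
    observation, and the remaining weight on the last observation.
    """
    q = s2 / z
    target = z / s2  # required sum of squared weights, condition (iii)
    if q <= n0:
        # Case 1: n = n0 + 1 and z/s^2 = 1/n0 + eps.
        eps = target - 1.0 / n0
        root = math.sqrt(1.0 + eps * n0 * (n0 + 1))
        a = (1.0 - root / n0) / (n0 + 1)
        last = (1.0 + root) / (n0 + 1)
        return [a] * n0 + [last]
    # Case 2: n = [s^2/z] + 1, which equals ceil(s^2/z) because [q] is
    # the largest integer strictly less than q; z/s^2 = 1/n + eps.
    n = math.ceil(q)
    eps = max(0.0, target - 1.0 / n)  # guard against rounding just below zero
    root = math.sqrt(eps * n * (n - 1))
    a = (1.0 + root / (n - 1)) / n
    last = (1.0 - root) / n
    return [a] * (n - 1) + [last]
```

For example, `stein_weights(4.0, 1.0, 10)` falls under Case 1 and puts excessive weight on the eleventh observation, while `stein_weights(12.5, 1.0, 10)` falls under Case 2 with thirteen observations and a reduced weight on the last. In Case 1 the first-sample weight becomes negative if $z/s^2$ is very large, so condition (iv) should be checked on the returned list.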


REFERENCE 


[1] Charles Stein, "A two-sample test for a linear hypothesis whose power is independent
of the variance," Ann. Math. Stat., Vol. 16 (1945), pp. 243-258.





NOTES 


SUMS OF SMALL POWERS OF INDEPENDENT RANDOM VARIABLES 


By J. M. Shapiro
Ohio State University 


1. Introduction and summary. Let $(x_{nk})$, $k = 1, 2, \dots, k_n$; $n = 1, 2, \dots$ be a
double sequence of infinitesimal random variables which are rowwise independent
(i.e. $\lim_{n \to \infty} \max_{1 \le k \le k_n} P(|x_{nk}| > \epsilon) = 0$ for every $\epsilon > 0$, and for each
$n$, $x_{n1}, \dots, x_{nk_n}$ are independent). Let $S_n = x_{n1} + \dots + x_{nk_n} - A_n$ where the
$A_n$ are constants and let $F_n(x)$ be the distribution function of $S_n$.

In a previous paper [3] the system of infinitesimal, rowwise independent
random variables $(|x_{nk}|^r)$ was studied for $r \ge 1$. Specifically, let

$$S_n^r = |x_{n1}|^r + \dots + |x_{nk_n}|^r - B_n(r),$$

where the $B_n(r)$ are suitably chosen constants. Let $F_n^r(x)$ be the distribution
function of $S_n^r$. Necessary and sufficient conditions for $F_n^r(x)$ to converge
($n \to \infty$) to a distribution function $F^r(x)$ and for $F^r(x)$ to converge ($r \to \infty$)
to a distribution function $H(x)$ were given, together with the form that $H(x)$
must take.

In Section 2 of this paper we consider the system $(|x_{nk}|^r)$ for $0 < r < 1$.
Results similar to the above are found, replacing ($r \to \infty$) by ($r \to 0^+$). However,
different assumptions must be made at certain points. Various remarks are
made in this paper to show where the results here differ from [3]. In particular
it is shown that, if $F^r(x)$ converges ($r \to 0^+$) to a distribution function $H(x)$,
then $H(x)$ will be the distribution function of the sum of two independent
random variables, one Poisson and the other Gaussian. Furthermore, while the
Gaussian summand may or may not be degenerate, the Poisson summand will be
nondegenerate in all but one special case.


2. Small powers of random variables.¹ In the remainder of the paper we use
the notation of [3].

Theorem 1. Let $\lim_{n \to \infty} F_n^r(x) = F^r(x)$ for $0 < r < 1$ and $\lim_{r \to 0^+} F^r(x) = H(x)$.
Then $H(x)$ is the distribution function of the sum of two independent random
variables, one Gaussian and the other Poisson.

We require the following lemma.

Lemma 1. If we add to the hypothesis of Theorem 1 the condition that
$\lim_{n \to \infty} F_n(x) = F(x)$, the conclusion of Theorem 1 holds.

The proof of this lemma follows the same lines as Lemma 1 of [3] except that

$$N^*(x) = \begin{cases} N(+\infty) - M(-\infty) = 0, & x > 1 \\ N(0^+) - M(0^-), & 0 < x < 1, \end{cases}$$


Received February 25, 1959; revised September 16, 1959. 
¹ The proofs in this section are similar to those given in [3], and hence they are condensed
or omitted.









which implies that $N(0^+)$ and $M(0^-)$ are finite. Thus $N^*(x)$ is either identically
0 or takes one jump at $x = 1$. In fact $N^*(x)$ is identically 0 if and only if
$N(0^+) = M(0^-) = 0$; i.e. if and only if $F(x)$ is Gaussian.

Proof of Theorem 1. Take $0 < s < 1$ and let $y_{nk} = |x_{nk}|^s$. Then

$$|x_{nk}|^r = |y_{nk}|^{r/s}$$

and, for $r/s < 1$, under the conditions of Theorem 1, the conditions of Lemma 1
are satisfied with the system $(x_{nk})$ replaced by $(y_{nk})$.

Remark. As can be seen from the above, if $F_n(x) \to F(x)$ then, under the conditions
of Theorem 1, $H(x)$ is Gaussian if and only if $F(x)$ is Gaussian. That is,
the (nondegenerate) Poisson summand will be present except when $F(x)$ is
Gaussian.

Lemma 2. If $\lim_{n \to \infty} F_n(x) = F(x)$ and if $M(x)$ and $N(x)$ are bounded,² then,
for suitably chosen constants $B_n(r)$, $F_n^r(x)$ converges to a distribution function
$F^r(x)$ if and only if

$$\lim_{\epsilon \to 0}\; {\limsup \atop \liminf}_{n \to \infty} \sum_{k=1}^{k_n} \left\{ \int_0^{\epsilon} x^{2r}\, d[F_{nk}(x) - F_{nk}(-x-)] - \left( \int_0^{\epsilon} x^{r}\, d[F_{nk}(x) - F_{nk}(-x-)] \right)^2 \right\} = \sigma_r^2 < \infty. \tag{2.1}$$

The proof of this lemma is similar to Lemma 2 of [3] and will be omitted.

Theorem 2. If $\lim_{n \to \infty} F_n(x) = F(x)$ then a necessary and sufficient condition
for $\lim_{n \to \infty} F_n^r(x) = F^r(x)$ and for $\lim_{r \to 0^+} F^r(x) = H(x)$ for suitably chosen
constants $B_n(r)$ is that

(2.2) $M(x)$ and $N(x)$ are bounded, (2.1) holds, and $\lim_{r \to 0^+} \sigma_r^2 = (\sigma^*)^2$, a finite constant.

Furthermore, $H(x)$ is Gaussian if and only if $F(x)$ is Gaussian; $H(x - m)$ is
nondegenerate Poisson if and only if $F(x)$ is not Gaussian and $\sigma^* = 0$, where
$m$ is a constant; otherwise $H(x)$ is the sum of two independent random variables,
one Gaussian and the other Poisson.

Proof. Necessity. It follows from the proof of Lemma 1 that $M(x)$ and $N(x)$
are bounded and by Lemma 2 that (2.1) holds. We also see (Theorem 2 page 88
of [1]) that if $\sigma^*$ is the non-negative constant associated with the infinitely
divisible distribution $H(x)$ then

$$\lim_{\epsilon \to 0}\; {\limsup \atop \liminf}_{r \to 0^+} \left\{ \int_{-\epsilon}^{0} u^2\, dM^r(u) + \sigma_r^2 + \int_0^{\epsilon} u^2\, dN^r(u) \right\} = (\sigma^*)^2. \tag{2.3}$$

² This could be replaced by a weaker condition; however, this condition appears in the
proof of Lemma 1 and will appear as a necessary and sufficient condition in Theorem 2.





Now since $M^r(u) \equiv 0$ and $N^r(u) = N(u^{1/r}) - M(-u^{1/r})$ we see that

$$\int_{-\epsilon}^{0} u^2\, dM^r(u) + \int_0^{\epsilon} u^2\, dN^r(u) \le \epsilon^2 \int_0^{\epsilon} d[N(u^{1/r}) - M(-u^{1/r})],$$

and, since $M(u)$ and $N(u)$ are bounded, we see that (2.2) holds.
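The step function $N^*$ arises because the argument $u^{1/r}$ in $N^r(u) = N(u^{1/r}) - M(-u^{1/r})$ collapses to 0 for $u < 1$ and grows without bound for $u > 1$ as $r \to 0^+$. A small numerical sketch of that behaviour follows; the function names and the chosen values of $u$ and $r$ are illustrative assumptions only.

```python
def small_power_argument(u, r):
    """The argument u**(1/r) that appears in N^r(u) = N(u^{1/r}) - M(-u^{1/r})."""
    return u ** (1.0 / r)


def argument_table(us, rs):
    """Rows of (u, [u**(1/r) for r in rs]) to display the limiting behaviour."""
    return [(u, [small_power_argument(u, r) for r in rs]) for u in us]
```

Calling `argument_table([0.5, 0.9, 1.1, 2.0], [0.5, 0.1, 0.01])` shows the entries for $u < 1$ shrinking toward 0 and those for $u > 1$ growing rapidly. Values of $r$ much smaller than 0.01 can overflow floating point for $u > 1$.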

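Sufficiency. By Lemma 2 we have $\lim_{n \to \infty} F_n^r(x) = F^r(x)$. Also, analogously
to [3], $\lim_{r \to 0^+} M^r(x) = 0 = M^*(x)$ and

$$\lim_{r \to 0^+} N^r(x) = N^*(x) = \begin{cases} 0, & x > 1 \\ N(0^+) - M(0^-), & 0 < x < 1. \end{cases}$$

Furthermore $\int_{-1}^{0} x^2\, dM^*(x) + \int_0^{1} x^2\, dN^*(x) < \infty$ and since

$$\lim_{\epsilon \to 0}\; {\limsup \atop \liminf}_{r \to 0^+} \left\{ \int_{-\epsilon}^{0} x^2\, dM^r(x) + \int_0^{\epsilon} x^2\, dN^r(x) \right\} = 0$$

(as in the necessity proof) we see that (2.3) holds. Now if we replace $r \to \infty$
by $r \to 0^+$, the remainder of the proof is the same as that of Theorem 2 of [3].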

Remark. In [3] the conditions imposed on M(x) and N(x), (and hence on 
F(x)) required F(x) to have moments of all orders (c.f. [2] and [1] page 83). 
In the present paper our conditions on $M(x)$ and $N(x)$ are different and in
particular do not require $F(x)$ to have any moments.

Theorem 3. If $\lim_{n \to \infty} F_n(x) = F(x)$, $M(x)$ and $N(x)$ are bounded, and if
for some $\epsilon > 0$

$$\sum_{k=1}^{k_n} \int_{|x| < \epsilon} |x|^s\, dF_{nk}(x) \tag{2.4}$$

is bounded in $n$ for any fixed $s > 0$, then, for suitably chosen constants $B_n(r)$,
$F_n^r(x)$ converges ($n \to \infty$) to a distribution function $F^r(x)$ and $F^r(x)$ converges
($r \to 0^+$) to the Poisson distribution.

Proof: We first show that (2.4) implies (2.1) with $\sigma_r = 0$. This follows since
for any $\epsilon > 0$, $r$ fixed and $s < 2r$ we have

$$\sum_{k=1}^{k_n} \left( \int_0^{\epsilon} x^{2r}\, d[F_{nk}(x) - F_{nk}(-x-)] \right) \le \frac{1}{\epsilon^{s - 2r}} \sum_{k=1}^{k_n} \int_{|x| \le \epsilon} |x|^s\, dF_{nk}(x).$$

Now using Theorem 2, since $\sigma^* = 0$, we see that (by proper choice of $B_n(r)$) $H(x)$
is a Poisson distribution (possibly degenerate).


REFERENCES 


[1] B. V. Gnedenko and A. N. Kolmogorov, Limit Distributions for Sums of Independent
Random Variables, translation by K. L. Chung, Addison-Wesley, 1954.

[2] J. M. Shapiro, "A condition for existence of moments of infinitely divisible distributions,"
Canadian J. of Math., Vol. 8 (1956), pp. 69-71.

[3] J. M. Shapiro, "Sums of powers of independent random variables," Ann. Math. Stat.,
Vol. 29 (1958), pp. 515-522.







ON THE MEDIAN OF THE DISTRIBUTION OF EXCEEDANCES 
By K. Sarkadi


Mathematical Institute of the Hungarian Academy of Sciences, 
Budapest, Hungary 


The distribution of exceedances may be defined by the following formula (see,
e.g. [2]), where $x$ corresponds to the number of exceedances:

$$w(n_1, m, n_2, x) = \frac{\binom{n_1 + n_2 - m - x}{n_1 - m}\binom{x + m - 1}{m - 1}}{\binom{n_1 + n_2}{n_1}}, \tag{1}$$

$x = 0, 1, \dots, n_2$; $n_1$, $n_2$, $m \le n_1$ given natural numbers.
There are known—among others—the following two fundamentally equivalent 
models or representations of this distribution [2], [3], [4]: 

A. Exceedances. We have two random samples of sizes $n_1$ and $n_2$, respectively,
from the same continuous distribution. The number of exceedances is defined
as the number of elements of the second sample which surpass at least
$n_1 - m + 1$ elements of the first, for a fixed natural number $m \le n_1$. The distribution
of the number of exceedances is given by formula (1).

B. Pascal model without replacement. An urn contains $n_1$ black and $n_2$ red balls.
We draw balls from the urn until we have drawn $m$ black balls. The distribution
of the number of the red balls drawn is given by (1).

In [1], Gumbel proved that, for $n_1 = n_2 = n$, the median of the number of exceedances
is $m - 1$, more precisely that

$$W(n, m, n, m - 1) = \tfrac{1}{2}, \tag{2}$$

where $W$ is the cumulated form of $w$,

$$W(n_1, m, n_2, x) = \sum_{y=0}^{x} w(n_1, m, n_2, y).$$

In this paper a simple proof of this result is given.
We shall, in fact, prove the following more general result:

$$W(n_1, m_1, n_2, m_2 - 1) + W(n_2, m_2, n_1, m_1 - 1) = 1. \tag{3}$$

In terms of model A, the $m_1$th element of the first sample exceeds the $m_2$th
element of the second sample if and only if the number of exceedances ($y$) takes
one of the values $0, 1, \dots, m_2 - 1$, these possibilities being mutually exclusive.


Received May 20, 1959; revised August 10, 1959 







Therefore, the first term of the left side in (3) denotes the probability that the
$m_1$th element of the first sample (from above) is larger than the $m_2$th element
of the second one; the second term denotes the probability of the opposite inequality.
Since either the inequality or its opposite must hold, equation (3) is
proved.
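Identities (2) and (3) can be confirmed by exact computation for small cases. The sketch below assumes that formula (1) is the negative hypergeometric form matching the Pascal model B, with $x$ red balls drawn before the $m$th black ball; the function names mirror the text's $w$ and $W$ but are otherwise hypothetical.

```python
from fractions import Fraction
from math import comb


def w(n1, m, n2, x):
    """Probability of exactly x exceedances, formula (1), as an exact fraction."""
    numerator = comb(n1 + n2 - m - x, n1 - m) * comb(x + m - 1, m - 1)
    return Fraction(numerator, comb(n1 + n2, n1))


def W(n1, m, n2, x):
    """Cumulated form: probability of at most x exceedances."""
    return sum((w(n1, m, n2, y) for y in range(x + 1)), Fraction(0))


def identity_3_holds(n1, m1, n2, m2):
    """Check W(n1, m1, n2, m2 - 1) + W(n2, m2, n1, m1 - 1) == 1."""
    return W(n1, m1, n2, m2 - 1) + W(n2, m2, n1, m1 - 1) == 1
```

Taking $n_1 = n_2$ and $m_1 = m_2 = m$ in `identity_3_holds` reduces (3) to $2W(n, m, n, m - 1) = 1$, which is Gumbel's median result (2).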

This proof can also be formulated with the notions of the Pascal model (B). 


The author is indebted to Prof. E. J. Gumbel for drawing his attention to 
this problem. 


REFERENCES 


[1] E. J. Gumbel and H. von Schelling, "The distribution of the number of exceedances,"
Ann. Math. Stat., Vol. 21 (1950), pp. 247-262.
[2] E. J. Gumbel, Statistics of Extremes, Columbia University Press, New York, 1958.
[3] K. Sarkadi, "On the distribution of the number of exceedances," Ann. Math. Stat.,
Vol. 28 (1957), pp. 1021-1023.
[4] Károly Sarkadi, "Generalized hypergeometric distributions," Publ. Math. Inst. Hung.
Ac. Sc., Vol. II (1957), pp. 59-69.





CORRECTION NOTES 


CORRECTION TO 
“GENERALIZATIONS OF A GAUSSIAN THEOREM” 


By Paul S. Dwyer

University of Michigan 
The following correction should be made to the paper cited in the title (Ann.
Math. Stat., Vol. 29 (1958), pp. 106-117). The letters $e$ and $\epsilon$ appear interchangeably
in sections 8 and 9. The values they represent are really the values of $\epsilon$
with $\theta = \theta^*$. Accordingly it would be much better if the $\epsilon$ at the beginning of
the second sentence of section 8 on page 113 were replaced by $e = A\theta^* - z$,
and each remaining $\epsilon$ in section 8 and section 9 were changed to $e$. I am indebted
to M. M. Rao who called this to my attention.


—— 


CORRECTION TO AND COMMENT ON 
“EQUALITY OF MORE THAN TWO VARIANCES AND OF MORE THAN 
TWO DISPERSION MATRICES AGAINST CERTAIN ALTERNATIVES” 


By R. Gnanadesikan¹


The Procter & Gamble Co. 


This note is motivated by a desire to clarify certain points in my paper [1]. 
In Section 4 of [1], the region of acceptance, (4.3), of a test for the null hypothesis 
$H_0: \Sigma_1 = \Sigma_2 = \dots = \Sigma_k = \Sigma_0$ is in error. The central result, which should have
been emphasized, was (5.5) of [1] which, of course, is an exact probability statement
with preassigned probability $1 - \alpha$. Starting from (5.5), however, one
obtains as the implied acceptance region for $H_0$ not (4.3), but the following
intersection region: 


(A) Cmax(Si) 5 and Cmin('Si) < j = 1,2,---,k, 


Cmin(So) Cmax (So) 


where 
Cmin (Sj) Cmax(S;) 
Cmax(So) Cmin(So) : 


Since (A) is obtained by implication from (5.5) of [1], it is, of course, true that 
this acceptance region will have a probability under the null hypothesis of at 


Aa < Aye and 


¹ Now with the Bell Telephone Laboratories, Inc., Murray Hill, New Jersey.







least $1 - \alpha$, and the phrase "size $(1 - \alpha)$" following (4.4) of [1] is not meant
to imply that the test proposed is a similar region test. Also, starting again from
(5.5) of [1], the implied simultaneous confidence statements (5.10) of [1], with
a confidence coefficient $\ge 1 - \alpha$, were obtained, and the main objective of [1]
was to obtain such confidence statements, while the test for $H_0$ was only of
secondary interest.

A question raised by T. W. Anderson, and which, in fact, was originally in- 
vestigated but temporarily abandoned by me, is whether it would not be more 
desirable to consider a test with the following intersection region of acceptance 


for $H_0$:

$$\nu_{j1} \le c_{\min}(S_j S_0^{-1}) \le c_{\max}(S_j S_0^{-1}) \le \nu_{j2}, \qquad j = 1, 2, \dots, k, \tag{B}$$

where $\nu_{j1}$ and $\nu_{j2}$, for $j = 1, 2, \dots, k$, are to be chosen such that this region
is of size $1 - \alpha$ under $H_0$. This is the natural extension of the test proposed by
Roy [2] for the case $k = 1$, and it formed the starting point of my original investigation
that led to [1]. While (B) is preferable to (A) as a test of $H_0$ against
certain types of alternatives, because the size of (B) does not depend on the
characteristic roots of $\Sigma_1 = \Sigma_2 = \dots = \Sigma_k = \Sigma_0$, yet, for $k > 1$, the distribution
problem associated with it seemed intractable and, furthermore, my
initial attempts to obtain simultaneous confidence statements associated with
(B) were not successful. These points are now being more fully investigated.
Finally, the test with acceptance region (A) may, against certain alternative 
hypotheses, be preferable to (B), although even here (A) itself may not be the 
best possible. This last point is to be more fully developed in a joint paper by 
S. N. Roy and myself. 

I thank T. W. Anderson, who kindly pointed out the need for clarification, 
and S. N. Roy for his comments and suggestions. 


REFERENCES 
[1] R. GNANADESIKAN, ‘‘Equality of more than two variances and of more than two dis- 
persion matrices against certain alternatives,’’ Ann. Math. Stat., Vol. 30 (1959), 
pp. 177-184. 
[2] S. N. Roy, "On a heuristic method of test construction and its uses in multivariate
analysis," Ann. Math. Stat., Vol. 24 (1953), pp. 220-238.




CORRECTION TO 
“THE USE OF SAMPLE QUASI-RANGES IN ESTIMATING 
POPULATION STANDARD DEVIATION” 


By H. Leon Harter 
Wright Air Development Division 


In the paper cited in the title (Ann. Math. Stat., Vol. 30 (1959), pp. 980-999),
on p. 988, the numerator and the denominator of (2) should be interchanged. 
This error does not affect the tables or other portions of the text. 







CORRECTION TO 
“A MULTIVARIATE GAMMA-TYPE DISTRIBUTION” 


By A. S. KrisHNAMOORTHY AND M. PARTHASARATHY 


The authors are indebted to P. R. Krishnaiah and M. M. Rao for having 
kindly drawn attention to the following corrections in the paper referred to 
above (Ann. Math. Stat., Vol. 22 (1951), pp. 549-557). 

Page 551: In equation (2.3) and everywhere in what follows, $p = \tfrac{1}{2}m$ on the
understanding that $m$ is a positive integer. If $p$ is any positive real number, the
legitimacy of the "mgf" in question does not follow from what has been demonstrated.

Page 554: Section 4 is incorrect and has to be omitted, since the convergence
condition for (4.1), obtained by the authors, is necessary but not sufficient for
(4.1) to be a frequency function. To see this, consider the special case n = 2,
choosing p (as is permissible) so that p piz > min (p; , ps). Inverting this "mgf",
one gets a function which is not a probability density. This special case is, in
fact, contained in the authors' reference [4] mentioned at the end of their paper.





ABSTRACTS OF PAPERS 


(Abstract of a paper presented at the Cambridge, Massachusetts Meeting of the Institute, 
August 26-28, 1958.) 


31. Markov Renewal Processes. Ronald Pyke, Columbia University. (Invited
Paper presented under the title, "On Multi-event Renewal Processes.")


Let $Q = \|Q_{ij}\|$, $1 \le i, j \le m$, $m < \infty$ be a matrix of transition distributions, i.e. each
$Q_{ij}$ is a mass function satisfying $Q_{ij}(t) = 0$ for $t \le 0$ and $\sum_{j=1}^{m} Q_{ij}(+\infty) = 1$. For discrete
probabilities $a_1, \dots, a_m$, let $\{(J_n, X_n);\ n \ge 0\}$ be a stochastic process satisfying

$$P[J_0 = k] = a_k, \qquad X_0 = 0,$$

and $P[J_n = k, X_n \le x \mid J_0, J_1, X_1, \dots, J_{n-1}, X_{n-1}] = Q_{J_{n-1},k}(x)$ a.s. For $t \ge 0$,
$1 \le j \le m$, define $N_j(t)$ as the number of times $J_n = j$ and $S_n < t$ for $n > 0$, where

$$S_n = X_1 + \dots + X_n.$$

The vector process $\{N_1(t), \dots, N_m(t);\ t \ge 0\}$ is called a Markov Renewal Process
(M.R.P.). Alternatively, it is possible to define an M.R.P. as an equivalent 1-dimensional
process. Set $N(t) = N_1(t) + \dots + N_m(t)$, and define $Z_t = J_{N(t)}$. The process

$$\{Z_t : t \ge 0\}$$

is called a Semi-Markov Process (S.-M.P.). An M.R.P. is an S.-M.P. (with a finite state
space) if and only if $Q_{jj} = 0$ for all $t$. Let $P_{ij}(t) = P[Z_t = j \mid Z_0 = i]$ and

$$G_{ij}(t) = P[N_j(t) > 0 \mid Z_0 = i],$$

the latter being the first passage-time distribution from state $i$ to state $j$. Relationships
between the $Q_{ij}$, $P_{ij}$ and $G_{ij}$ are derived. These can be solved to obtain expressions for the
$P_{ij}$ and $G_{ij}$ in terms of the $Q_{ij}$. For example, one may show $\|P_{ij}\| = (I - Q)^{-1}(I - \mathcal{H})$,
where the elements of $\mathcal{H}$ are given by $H_{ij} = \delta_{ij}\sum_{k=1}^{m} Q_{ik}$. M.R.P.'s are generalizations
both of discrete and continuous parameter Markov Chains. They have many applications,
the one which motivated the author's definition and study of these processes being to the
theory of multiple channel electronic counters.


(Abstracts of papers presented at the Washington, D. C., Annual Meeting of the Institute, 
December 27-30, 1959.) 


24. Main-Effect Designs for Asymmetrical Factorial Experiments. Sidney
Addelman, Iowa State University.


A method of constructing orthogonal designs which allow the estimation of main effects
for a general class of asymmetrical factorial experiments is presented. By the use of the
suggested method of construction, it is possible to obtain a design in which all main effects
are preserved, for the $s_1^{t_1} \times s_2^{t_2} \times \dots \times s_k^{t_k}$ experiment in $s_1^n$ observations, where $s_1$ is a
prime or a power of a prime, $s_1 > s_2 > \dots > s_k$, and $\sum t_i = (s_1^n - 1)/(s_1 - 1)$. As an
interesting consequence of the above method of construction, one is able to obtain main-effect
designs for symmetrical factorial experiments in which the number of levels of each
factor is not a prime or a power of a prime.









25. A Probability Model for Theory of Organization of Groups with Multi- 
Valued Relations Between Persons. Joun L. Baca, Florida State Uni- 
versity. 


We are concerned with relations between ordered pairs of distinct individuals in a finite
group of $n$ individuals. Let there be $(k + 1)$ distinct types of relations where the $(k + 1)$st
relation usually denotes the null relation. The relations between ordered pairs $(i, j)$,
$i, j = 1, 2, \ldots, n$, $i \ne j$, of individuals are represented by an $n \times n \times k$ matrix $C$ with elements
$c_{iju} = 0$ or $1$, $u = 1, 2, \ldots, k$. We let $r_{iu} = \sum_j c_{iju}$ represent the total number of choices
made by individual $i$ and $s_{ju} = \sum_i c_{iju}$ represent the total number of choices received by
individual $j$ with respect to the $u$th relation, where $c_{iju} = 1$ if individual $i$ chooses individual
$j$, $i \ne j$, otherwise $c_{iju} = 0$; $c_{iiu} = 0$ for all $i$ and $u$. We insist that one and only one of the
$(k + 1)$ relations exist between each ordered pair $(i, j)$, i.e., $\sum_{u=1}^{k} c_{iju} \le 1$. We let
$r^u = (r_{1u}, \ldots, r_{nu})$ and $s^u = (s_{1u}, \ldots, s_{nu})$ be the marginal row and column total vectors for
the $n \times n$ submatrix of $C$ for the $u$th relation. We let $(r, s) = (r^1, s^1, \ldots, r^k, s^k)$. The
main theorem gives a procedure for counting the exact number of matrices $C$ for any
given fixed $2nk$ dimensional vector $(r, s)$ subject to the previous restrictions on the
elements $c_{iju}$. This is an extension of a result obtained by Katz and Powell (Proc. Amer.
Math. Soc., Vol. 5, 1954).
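
The counting problem above can be made concrete for very small groups by exhaustive enumeration. The sketch below is not the procedure of the main theorem, which the abstract does not state; it is a brute-force check, with the hypothetical function name `count_relation_matrices`, that assigns to every ordered pair one of the $k$ relations or the null relation and counts the assignments whose marginal totals equal the given $(r, s)$.

```python
from itertools import product

def count_relation_matrices(n, k, r, s):
    """Count n x n x k zero-one arrays C with prescribed marginals.

    r[u][i] is the number of choices made by individual i for relation u,
    s[u][j] is the number of choices received by individual j for relation u.
    Each ordered pair (i, j), i != j, carries at most one of the k relations.
    Brute force: only practical for very small n and k.
    """
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    count = 0
    # Label 0 stands for the null relation, labels 1..k for the k relations.
    for labels in product(range(k + 1), repeat=len(pairs)):
        row = [[0] * n for _ in range(k)]
        col = [[0] * n for _ in range(k)]
        for (i, j), u in zip(pairs, labels):
            if u > 0:
                row[u - 1][i] += 1
                col[u - 1][j] += 1
        if row == r and col == s:
            count += 1
    return count

# Three individuals, one relation, everyone makes and receives one choice.
print(count_relation_matrices(3, 1, [[1, 1, 1]], [[1, 1, 1]]))
```

With one relation and all marginals equal to one, the admissible matrices are the permutation matrices with zero diagonal, so the count for three individuals is the number of derangements of three objects.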


26. Multiple-Decision Ranking Problems Arising from Factorial Experiments
on Variances of Normal Populations (Preliminary report). Robert E.
Bechhofer, Cornell University.


A multiplicative model is considered as a basis for analyzing multifactor experiments
which are conducted to study the effect of changes in the levels of the factors on the
variance of a normally distributed chance variable. A single-sample multiple-decision
procedure for ranking the treatment "effects" on the variance when the experiment is
conducted in blocks (and the block "effects" are thus removed) is proposed. The procedure
is a generalization of the one described in these Annals, Vol. 25, pp. 273-289. Similar
procedures can be used in multifactor experiments for ranking simultaneously the "effects"
of two or more factors. Tables of the type given in the above reference are being prepared.
Some of these tables can also be used for testing hypotheses concerning the "effects" or
for forming interval estimates of the "effects."


27. A "Renewal" Limit Theorem for General Stochastic Processes. V. E.
Beneš, Bell Telephone Laboratories and Dartmouth College.


Let $z_t$ be a stochastic process on a space $X$, and let $\{z_t \in A\}$ be a measurable set, $A \subset X$.
Let $t_n$, $n = 0, \pm 1, \pm 2, \ldots$, be a real discrete-parameter process on the same measure space,
with $t_{n+1} > t_n$ a.s. and $H(z) - H(y)$ = expected number of $t_n \in (y, z) < \infty$. $\Pr\{z_t \in A\}$
can always be written as $\int K_A(t, u)\, dH(u)$. The event $\{z_t \in A\}$ is called weakly stationary
w.r. to $\{t_n\}$ if its representative kernel $K_A$ is a difference kernel, $K_A(t, u) = K_A(t - u)$.
Theorem: Let $y_t$ be the time from $t$ to the next $t_n$, i.e., $y_t = \min\{t_n - t \mid t_n > t\}$. If $\{y_t < \infty\}$
and $\{z_t \in A\}$ are both weakly stationary w.r. to $\{t_n\}$ with respective $L_1$ kernels $Y$ and $K_A$,
if the $\{t_n\}$ are "aperiodic" in the sense that the Fourier transform of $Y$ does not vanish,
and if $H(\cdot + 1) - H(\cdot)$ is bounded, then $\lim \Pr\{z_t \in A\} = \|K_A\| / \|Y\|$ ($L_1$ norm) as
$t \to \infty$.







28. Use of Prior Knowledge in Finding the Maximum Response. R. J. Buehler,
Iowa State University.


In seeking the value of a vector of control variables $x$ which maximizes an expected yield,
$Ey = f(x)$, the choices of $x$ for the initial observations must of necessity depend on a
subjective judgement based on prior knowledge. The following problems are considered: (1)
Under what prior assumptions does the "path of steepest ascent" have optimal properties?
(2) What are the properties of some other paths, for example those determined by choosing
the $n$th vector $x_n$ to maximize the conditional expectation of the $n$th yield $y_n$ given the
first $n - 1$ observations?


29. A Subfield Containing a Sufficient Subfield is Not Necessarily Sufficient.
D. L. Burkholder, University of Illinois.


Let $X$ be Euclidean two-space, $S$ be the sigma-field of Borel sets of $X$, and $P$ be the
class of all probability measures $p$ on $S$ of the form $p = q \times q$ where $q$ is a probability
measure on the sigma-field of Borel sets of the real line. Let


$t_0(x) = (\min\{x_1, x_2\}, \max\{x_1, x_2\})$


if $x = (x_1, x_2)$ is in $X$. Let $t_1(x) = x$ if $x$ is in $B$, $= t_0(x)$ if $x$ is in $X - B$, where $B$ is a
subset of $X$ not in $S$ such that $x_1 > x_2$ for each $x$ in $B$. Let $S_i$ be the subfield of $S$ induced
by the statistic $t_i$ for $i = 0, 1$. Then $t_0$ is a sufficient statistic and $S_0$ is a sufficient subfield
for the measures $P$ on $S$. However, $t_1$ and $S_1$ are not sufficient. This is in spite of the fact
that $t_0 = F(t_1)$ for some function $F$ and $S_0 \subset S_1$. This example provides a negative answer
to a question posed by Bahadur on page 441 of his paper "Sufficiency and statistical
decision functions," Ann. Math. Stat., Vol. 25 (1954), pp. 423-462.


30. Optimum Properties and Admissibility of Sequential Tests. D. L.
Burkholder and R. A. Wijsman, University of Illinois.


Suppose $X_1, X_2, \ldots$ are independent and identically distributed, with common density
$p_0$ or $p_1$, and it is desired to test one possibility against the other. In the following, $i = 0, 1$.
If $S$ is a sequential test, let $\alpha_i(S)$ denote the error probabilities, $\nu_i(S) = E_i c_i(N)$, where
$N$ is the sample size, $0 = c_i(0) \le c_i(1) \le \cdots \le c_i(\infty) = \infty$, and $c_i(n) \to \infty$ as $n \to \infty$.
$S$ will be called inadmissible if there is an $S^*$ such that $\alpha_i(S^*) \le \alpha_i(S)$, $\nu_i(S^*) \le \nu_i(S)$,
with strict inequality in at least one of the four. $S^*$ is said to have optimum property
I (OP$_{\mathrm{I}}$) if $\nu_i(S^*) < \infty$, and $\nu_i(S) \ge \nu_i(S^*)$ for each $S$ satisfying $\nu_i(S) < \infty$ and
$\alpha_i(S) \le \alpha_i(S^*)$.
$S^*$ has OP$_{\mathrm{II}}$ if $\nu_i(S) \ge \nu_i(S^*)$ for each $S$ satisfying $\alpha_i(S) \le \alpha_i(S^*)$. Theorem 1. If $S^*$ has
OP$_{\mathrm{I}}$ then it has OP$_{\mathrm{II}}$. For $c_i(n) = n$, Wald and Wolfowitz have shown that the Wald
SPRT with barriers $0 < B < 1 < A < \infty$ satisfies OP$_{\mathrm{I}}$. Hence, by Theorem 1, it must
satisfy OP$_{\mathrm{II}}$. In the next theorem it is assumed that $c_i(n) = n$. Theorem 2. If $S$ is a SPRT
with either $B < A < 1$ or $1 < B < A$, then $S$ is inadmissible. $S$ can be improved upon by
a mixture of at most 3 tests, one of which does not take any observations, such that the
mixture is not only admissible but possesses OP$_{\mathrm{I}}$ and therefore also OP$_{\mathrm{II}}$.


31. Conditional Expectations of Banach-Valued Random Variables. S. D.
Chatterji, Michigan State University. (Introduced by K. J. Arnold.)


The notion of conditional expectation of Banach-valued random variables has been
introduced and a study of martingales of such random variables has been made. Three
different cases arise, depending upon the topology used: the strong, weak and weak star if
the Banach space of values is the dual space of another Banach space. The corresponding
notions of integration used are Bochner, Pettis and Gelfand respectively. Owing to the
non-existence of theorems of Radon-Nikodym type for Banach-valued measures, separate
proofs for the existence of conditional expectations had to be given. The theory simplifies
in the strong topology and the usual properties of conditional expectations are valid in this
case. Convergence of martingales of the type $X_n = E(Z \mid \mathcal{F}_n)$, $n \ge 1$, and $X_{-n} = E(Z \mid \mathcal{F}_{-n})$,
both almost everywhere and in $L_p$, are proved independently of the classical theory.
Convergence in $L_p$ has also been proved for the above martingales when $\mathfrak{X}$ is reflexive by extending
a method due to Jerison (Proc. Amer. Math. Soc. Vol. 10, 1959) using mean ergodic theorems.
For doing this, weak completeness and compactness properties of spaces $L_p(\Omega, \mathcal{A}, P, \mathfrak{X}) =
\{X(\omega) : X(\omega)$ strongly measurable, $\int \|X(\omega)\|^p \, dP < \infty\}$ have been studied. The results
are used to prove the strong law of large numbers for Banach-valued random variables and
the theory of derivatives in Banach spaces.


32. Certain Extensions of a Theorem of Marcinkiewicz (Preliminary Report).
Inge Christensen, Catholic University.


This paper considers the function $f(t) = K_n f_1(t) e_n[P_m(t)]$, where $K_n$ is a constant and
where $P_m(t)$ is a polynomial of degree $m$ with complex coefficients $\alpha_\nu + i\beta_\nu$, and where
the iterated exponentials $e_n(z)$ are defined as follows: $e_1(z) = \exp(z)$, $e_2(z) = \exp[e_1(z)]$,
$\ldots$, $e_n(z) = \exp[e_{n-1}(z)]$. Using the analytic approach of E. Lukacs (Pacific J. Math.,
Vol. 8 (1958), pp. 487-501), it has been shown that if $m > 2$, then $f(t)$ cannot be a
characteristic function in the following cases: (i) $f_1(t) = \exp[\gamma_1(e^{it} - 1) + \gamma_2(e^{-it} - 1)]$; (ii) $f_1(t) =
\exp[g(t) - 1]$ where $g(t)$ is an entire characteristic function belonging to a lattice
distribution with the origin as a lattice point; (iii) $f_1(t)$ is the characteristic function of a
binomial distribution. In the case where $n = 1$ and $f_1(t)$ is the characteristic function
of a gamma distribution, it has been shown that $f(t)$ cannot be a characteristic function if
$m > 3$ or if $m = 3$ and $\beta_3$ is zero or negative.


33. Minimax Sequential Tests of Some Composite Hypotheses. Morris H.
DeGroot, Carnegie Institute of Technology.


Let $\{X(t);\ t \ge 0\}$ be a Wiener process with unknown mean $\mu$ per unit time and known
variance per unit time. The problem is to test the hypotheses $H_0 : \mu \le \mu_0$ and $H_1 : \mu > \mu_0$,
where $\mu_0$ is a given constant. Let the cost of accepting an incorrect hypothesis when $\mu$ is
the true mean be of the form $c\,|\mu - \mu_0|^r$, where $c > 0$ and $0 < r \le 2$. Let the cost of
observing the process for a time $T$ be $bT$, where $b > 0$. Under these conditions it is shown
that the minimax test is a specific sequential probability ratio test; i.e., a test under which
the process is observed as long as $h_1 + st < X(t) < h_2 + st$ for appropriate constants
$h_1$, $h_2$, and $s$. The analogous problem of testing composite hypotheses about the mean of
a normal distribution is considered and it is shown that if the cost per observation is large,
the minimax test is to take exactly one observation and then accept one of the hypotheses.


34. Small Sample Behavior of Estimators of Parameters in a Linear Functional
Relationship. Martin Dorff and John Gurland, Iowa State University.


Housner and Brennan (1948) and Durbin (1954) have proposed a very simple consistent
estimator of the slope in a linear functional relationship between two variables subject to
error: $b = (\sum w_i y_i)/(\sum w_i x_i)$ $(\sum w_i = 0)$, where the weights $w_i$ are very simply related to the
serial order of the observations; that is, $w_i = i - \bar{i}$. If one knew that the true values $X_i$
corresponding to the observed values $x_i$ were uniformly spaced, this would clearly be a
desirable estimator; in fact, it is precisely the usual least-squares estimator. The question
arises, how does this estimator behave when the $X_i$ are not uniformly spaced? It is possible
to obtain the bias and mean square error of this estimator for various error distributions
without undue difficulty if it is assumed that ordering the points according to the $x_i$ is the
same as ordering the points according to the $X_i$. It is shown that the bias and mean square
error are surprisingly insensitive to even wide-spread departures from uniform spacing,
and in particular, that the bias is much less than that obtained when using the ordinary
least-squares estimator.
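
The estimator $b$ is simple enough to compute directly. The following is a minimal sketch, assuming the observations are supplied in their serial order; the function name `serial_order_slope` is introduced here for illustration only.

```python
def serial_order_slope(x, y):
    """Slope estimate b = sum(w_i * y_i) / sum(w_i * x_i) with w_i = i - mean(i).

    x and y are assumed to be listed in the serial order of the observations,
    so the weights depend only on position and sum to zero.
    """
    n = len(x)
    mean_index = (n + 1) / 2  # mean of the indices 1, ..., n
    weights = [i - mean_index for i in range(1, n + 1)]
    numerator = sum(w * yi for w, yi in zip(weights, y))
    denominator = sum(w * xi for w, xi in zip(weights, x))
    return numerator / denominator

# Error-free points on the line y = 2x + 1, with unevenly spaced x values.
print(serial_order_slope([1, 2, 4, 8], [3, 5, 9, 17]))
```

Because the weights sum to zero, the intercept drops out, and for points lying exactly on a line the estimator returns the slope whether or not the $x$ values are evenly spaced.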


35. On the Distribution of a Noncircular Serial Correlation Coefficient with
Lag 1 When the Mean of the Observations is Unknown. Friedhelm
Eicker, University of North Carolina.


In the theory of time series several serial correlation coefficients have been used for
testing the independence between the observations $x_i$. In this paper the $x_i$ are considered
to be distributed like $N(m, 1)$ where $m$ is unknown. As a suitable noncircular serial
correlation coefficient with lag 1 for the test of independence is considered $r = q/p$ with
$q = \sum_{i=1}^{n-1} (x_i - \bar{x})(x_{i+1} - \bar{x})$, $p = \sum_{i=1}^{n} (x_i - \bar{x})^2$, where $\bar{x} = (1/n) \sum_{i=1}^{n} x_i$ and $n$ is the sample
size. So far not much seems to be known about the distribution of $r$. In this paper its first
cumulants are derived. This is done by starting from a divisor of the characteristic
polynomial of the matrix of the quadratic form in the numerator. Thereby use is made of a
symmetry in its characteristic vectors and of the relations between power sums and
complete symmetric functions. Some results of Siddiqui's work on noncircular coefficients for
known mean $m$ are utilized here. As is to be expected our results do not differ very much
from his. Besides these exact results, bounds are found for all cumulants. The method used
here is related to perturbation theory and another theory developed mainly by Schaefke
for characteristic value problems with two parameters.
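
As an illustration of the statistic itself (not of the cumulant derivation, which the abstract only outlines), the coefficient $r = q/p$ can be computed as follows. The function name is hypothetical.

```python
def noncircular_serial_correlation(x):
    """Lag-1 noncircular serial correlation r = q / p about the sample mean.

    q sums products of successive deviations (n - 1 terms, no wrap-around);
    p is the sum of squared deviations from the sample mean.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    q = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    p = sum(d * d for d in dev)
    return q / p

print(noncircular_serial_correlation([1, 2, 3, 4]))
```

For the sample 1, 2, 3, 4 the deviations are -1.5, -0.5, 0.5, 1.5, giving $q = 1.25$ and $p = 5$.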


36. Partnership Games with Secret Conventions Prohibited. Martin Fox and
Herman Rubin, Michigan State University.


The ethics of bridge prohibit the use of secret signals by any partnership. This is
explicitly stated in Law 5 of "The Laws of Duplicate Contract Bridge" (Ely Culbertson,
Bidding and Play in Duplicate Contract Bridge, John C. Winston, Philadelphia, 1946, pp.
223-224.) Two game-theoretical formulations of this rule are: 1. Whenever an agent of
either player is required to make a bid or to play a card as defender he must announce his
behavioral strategy as well as the bid or play which results from the randomization required
by the behavioral strategy. 2. Instead of announcing the behavioral strategy to all other
agents, the agent who is moving announces it to a referee. The referee announces to each
of the other agents their a posteriori probabilities of each distribution of the cards unseen
by them given the previous sequence of bids. It is shown that with rule 1 bridge has a
value. Furthermore, each player has a good strategy in which the behavioral strategies at
each move depend only on the a posteriori probabilities.


37. A Simplified Method for Finding Confidence Limits on the Relative Risk in
2 × 2 Tables. John J. Gart, Johns Hopkins University. (By title)


Consider a 2 × 2 table with a total of $m$ positives of which $x$ are from the first sample
of size $n_1$, and $m - x$ are from the second sample of size $n_2$; the designation of the samples
and the positives being defined by the relations: $m \le n_1 + n_2 - m$ and $x/n_1 \le (m - x)/n_2$.
Cornfield (Proc. of Third Berk. Symp., IV, pp. 135-148) and Cox (J. R. S. S. (B), Vol. 20
(1958), pp. 215-238) have proposed methods for finding approximate confidence limits on
the relative risk, namely: $\psi = p_1 q_2 / p_2 q_1$, where $p_1$ and $p_2$ are the population proportions.
The method proposed here involves an approximation to the sum of the hypergeometric
probabilities similar to the first term approximation of Wise (Biometrika, Vol. 41 (1954),
pp. 317-329). It yields the lower limit,


$\psi_1 = \dfrac{2n_2 - (m - x)}{2n_1 - x + 1} \cdot \dfrac{x}{m - x + 1} \cdot \dfrac{1}{F_{1-\alpha/2}[2(m - x + 1),\ 2x]}$,
and the upper limit,
$\psi_2 = \dfrac{2n_2 - (m - x) + 1}{2n_1 - x} \cdot \dfrac{x + 1}{m - x} \cdot F_{1-\alpha/2}[2(x + 1),\ 2(m - x)]$,


where the approximate confidence coefficient is $1 - \alpha$. Several examples have shown that
this method yields confidence coefficients which are comparable to those found using the
previously proposed methods.


38. A Single Sample Decision Procedure for Selecting a Subset Containing the
Best of Several Normal Populations and Some Extensions. S. S. Gupta,
Bell Telephone Laboratories.


Let $\bar{x}_i$, $s_i^2$ denote the sample mean and sample variance based on $n_i$ observations from a
normal population $\Pi_i$ with mean $\mu_i$ and a common variance $\sigma^2$ ($\mu_i$ and $\sigma^2$ unknown). A single
sample decision procedure for selecting a non-empty, small subset of the $k$ populations
such that the probability that the population with the largest mean is included in the
selected subset is at least equal to a pre-assigned value $P^*$ (regardless of the true unknown
values of the parameters) is given. The procedure is "Select the population $\Pi_i$ if and only
if $\bar{x}_i \ge \max(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{i-1}, \bar{x}_{i+1}, \ldots, \bar{x}_k) - cs/(n_i)^{1/2}$" where $s^2$ is the usual pooled
estimate of $\sigma^2$ and $c$ is determined to satisfy the required probability condition.
Expressions for the probability of a correct selection are derived and in the case of a common
number of observations, the constants $c$ are shown to be the percentage points of a certain
statistic. The case of unequal but known variances $\sigma_i^2$ is also treated. Formulae are
obtained for the expected number of populations retained in the selected subset and, for
selected cases, tables are given for the expected proportion of populations retained. The
latter tables can be used to determine the common number of observations required to
control the expected size (or proportion) of the retained subset when the best population
has a certain "distance" from the others. Some extensions of the procedure to other
parametric cases are given.
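
The selection rule quoted above can be sketched in a few lines. This is an illustration only: the constant `c` is passed in as an argument because the abstract does not give the tables from which it is determined, and the function name is hypothetical.

```python
import math

def select_subset_means(means, sizes, pooled_sd, c):
    """Indices i with mean_i >= max of the other means - c * s / sqrt(n_i).

    means[i] and sizes[i] are the sample mean and sample size of population i,
    pooled_sd is the pooled estimate s of the common standard deviation,
    and c is assumed to have been chosen to meet the probability requirement.
    """
    selected = []
    for i, (mean_i, n_i) in enumerate(zip(means, sizes)):
        best_other = max(m for j, m in enumerate(means) if j != i)
        if mean_i >= best_other - c * pooled_sd / math.sqrt(n_i):
            selected.append(i)
    return selected

print(select_subset_means([10.0, 9.5, 7.0], [4, 4, 4], pooled_sd=1.0, c=2.0))
```

The population with the largest sample mean always satisfies the inequality when $c \ge 0$, which is why the selected subset is never empty.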


39. On the Distribution of the Ratio of the Smallest of Several Chi-Squares
to an Independent Chi-Square. S. S. Gupta and M. Sobel, Bell Telephone
Laboratories. (By title)


This paper deals with the problem of finding lower percentage points of the distribution
of $y = \chi^2_{\min}/\chi^2_0$ where $\chi^2_{\min}$ is the smallest of $p$ independent chi-squares and $\chi^2_0$ is a chi-square
independent of the $p$ others. The case of a common even number of degrees of freedom for
all $p + 1$ chi-squares is the principal case considered and the only case for which
computation was carried out. Tables give the 25%, 10%, 5%, and 1% points for common $\nu = 2(2)50$
and $p = 1(1)10$; the case $p = 1$ which reduces to an $F$-distribution was used as a check.
Relationships to the distribution of $\chi^2_{\max}/\chi^2_0$ are considered. The tables computed have
immediate application to the problem of selecting a subset of $k\ (= p + 1)$ normal populations,
based on a common number of observations from each, which contains the population with
the smallest variance with any pre-assigned probability.


40. On a Single Sample Procedure for Selecting from Several Normal
Populations a Subset Containing the Population with the Smallest Variance.
S. S. Gupta and M. Sobel.


A procedure is studied for selecting a subset of several given normal populations which
includes the population with the smallest variance. For given numbers of observations
from each of $k$ normal populations, the procedure $R$ selects a subset which is small (the
exact size depends on the observed results), never empty and yet large enough to guarantee
with preassigned probability that, regardless of the true unknown values of the variances,
it will include the population with the smallest variance. If $s_i^2$ based on $\nu_i$ degrees of
freedom denotes the sample variance from the population $\Pi_i$ then the procedure is "Select $\Pi_i$
if and only if $c s_i^2 \le \min(s_1^2, s_2^2, \ldots, s_{i-1}^2, s_{i+1}^2, \ldots, s_k^2)$" where the constant $c$ $(0 < c < 1)$
is determined so as to satisfy the required probability condition. Expressions are derived
for the probability of a correct selection, and for the expected number of populations
included in the selected subset. The relationship to the problem of selecting the population
with the largest variance is discussed.
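
A minimal sketch of the rule for variances follows, again with the constant `c` supplied by the caller as an assumption, since the abstract does not tabulate it. The function name is hypothetical.

```python
def select_subset_variances(sample_variances, c):
    """Indices i for which c * s_i^2 does not exceed every other sample variance.

    c must lie strictly between 0 and 1; it is assumed to have been chosen
    beforehand so that the required probability condition holds.
    """
    if not 0 < c < 1:
        raise ValueError("c must satisfy 0 < c < 1")
    selected = []
    for i, var_i in enumerate(sample_variances):
        smallest_other = min(v for j, v in enumerate(sample_variances) if j != i)
        if c * var_i <= smallest_other:
            selected.append(i)
    return selected

print(select_subset_variances([1.0, 1.5, 4.0], c=0.5))
```

Since $c < 1$, the population with the smallest sample variance always passes the test, so the selected subset cannot be empty.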


41. Almost Linearly-Optimum Combination of Correlated Unbiased Estimates
by Regression Methods. Max Halperin, Knolls Atomic Power Laboratory.


Suppose one has available a multi-normal sample $(y_{1i}, y_{2i}, \ldots, y_{ki})$, $i = 1, 2, \ldots, n$,
with mean vector $\mu j$ ($\mu$ a scalar, $j$ a unit row vector) and arbitrary covariance matrix $\Sigma$.
The coefficients of the minimum variance linear unbiased estimate (MVLUE) of $\mu$ will, of
course, involve the (unknown) elements of the covariance matrix. We can estimate these
coefficients and still have an unbiased estimate of $\mu$ with, however, an unknown distribution
almost certainly involving nuisance parameters. Transform the sample into $(y_{1i}, d_{2i},
\ldots, d_{ki})$, $i = 1, 2, \ldots, n$, where $d_{ji} = y_{ji} - y_{1i}$, and consider the distribution of $y_{11}, \ldots, y_{1n}$
given $(d_{2i}, \ldots, d_{ki})$, $i = 1, 2, \ldots, n$. Defining $\sigma^2_{\mathrm{opt}}$ as the variance of the MVLUE based
on a single observation, $(y_1, \ldots, y_k)$, one finds that, conditionally, $y_{1i}$ is normal with
variance $\sigma^2_{\mathrm{opt}}$ and expected value $\mu + \sum_{j=2}^{k} \beta_j d_{ji}$. It follows immediately that the regression
estimate of $\mu$ is given by $\hat{\mu} = \bar{y}_1 - \sum_{j=2}^{k} \hat{\beta}_j \bar{d}_j$, where the carets denote maximum likelihood
estimates, and that $\operatorname{Var} \hat{\mu} = (\sigma^2_{\mathrm{opt}}/n)\{1 + T^2/(n - 1)\}$ where $T^2$ is Hotelling's $T^2$ statistic
with $(k - 1)$ and $(n - k + 1)$ degrees of freedom for the vectors $(d_{2i}, \ldots, d_{ki})$, $i = 1, 2,
\ldots, n$. This variance is identical to the variance of the MVLUE except for the factor
$T^2/(n - 1)$, which is trivial for $n$ at all large. An estimate of $\sigma^2_{\mathrm{opt}}$ with $(n - k)$ d.f. is
available from the sum of squares of deviation from regression, so that exact confidence
intervals for $\mu$ are available. Note that these results apply also to $k$ independent samples of
equal size, with no intrinsic pairing from sample to sample, by the introduction of
randomization.


42. Certain Uncorrelated Statistics. Robert V. Hogg, University of Iowa.


Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution symmetric about $\theta$. Let
$T = T(X_1, X_2, \ldots, X_n)$ be a statistic such that $E(T) = \theta$, $T(X_1 + h, \ldots, X_n + h) =
T(X_1, \ldots, X_n) + h$, and $T(-X_1, \ldots, -X_n) = -T(X_1, \ldots, X_n)$. Let $S =
S(X_1, \ldots, X_n)$ be a statistic such that $S(X_1 + h, \ldots, X_n + h) = S(X_1, \ldots, X_n)$ and
$S(-X_1, \ldots, -X_n) = S(X_1, \ldots, X_n)$. If the correlation coefficient of $T$ and $S$ exists, it
is equal to zero.
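
The sample mean and the sample variance satisfy the conditions on $T$ and $S$ respectively, so the statement can be checked exactly on a small discrete example. The sketch below, with hypothetical names and distributions chosen only for illustration, enumerates all samples of size $n$ and computes the covariance of the two statistics in exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def exact_cov_mean_variance(values, probs, n):
    """Exact covariance of the sample mean and the (n - 1)-divisor sample
    variance for n independent draws from a finite discrete distribution."""
    e_t = e_s = e_ts = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        p = Fraction(1)
        for i in idx:
            p *= probs[i]
        xs = [Fraction(values[i]) for i in idx]
        mean = sum(xs) / n
        var = sum((x - mean) ** 2 for x in xs) / (n - 1)
        e_t += p * mean
        e_s += p * var
        e_ts += p * mean * var
    return e_ts - e_t * e_s

quarter, half, third = Fraction(1, 4), Fraction(1, 2), Fraction(1, 3)
# Symmetric about 2: the covariance vanishes.
print(exact_cov_mean_variance([1, 2, 3], [quarter, half, quarter], 3))
# Not symmetric (Bernoulli with success probability 1/3): it does not.
print(exact_cov_mean_variance([0, 1], [1 - third, third], 3))
```

The second call shows that symmetry matters: for an asymmetric distribution the covariance of the mean and variance equals the third central moment divided by $n$, which is not zero here.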







43. Further Results on Hypothesis of No Interaction in Multidimensional Table
(Preliminary report). P. R. Krishnaiah, University of Minnesota and
V. K. Murthy, University of North Carolina.


In a contingency table, any dimension is defined as a factor or response according as its
marginal totals are fixed or random. Roy and Kastenbaum (these Annals, 1956) discussed
the hypothesis of no interaction in a three way table when all dimensions are responses.
In this paper, the extension of the above results to multiway tables is discussed when some
dimensions are factors and the rest are responses.


44. Remarks on "Standard Coefficients" in Normal Regression Analysis.
P. R. Krishnaiah and M. M. Rao, University of Minnesota. (By title)


In some applications of the normal regression analysis, for computational reasons, the
so-called "standard partial regression (or beta) coefficients" are in use. For estimation of
the usual multiple regression coefficients the use of either this procedure or the direct
calculation is immaterial. But the fact that the "beta" coefficients are not normally distributed
and that generally the usual test procedures (the "Student's" $t$ for testing the regression
coefficients to have specified values not necessarily zero, and the confidence bounds obtained
therefrom) are not valid for these standard coefficients, is overlooked. The only valid one
in the "new procedure" is the over-all test for the hypothesis of no regression. The correct
procedure and the distribution of the beta's (in series form) are indicated in this note.


45. On Characterization Problems Connected with Quadratic Regression.
R. G. Laha and E. Lukacs, Catholic University.


Let $X$ and $Y$ be two random variables. Then $Y$ is said to have polynomial regression of
order $p$ on $X$, if the conditional expectation of $Y$ given $X$ is a polynomial of degree $p$ in $X$.
In particular, if $p = 2$, $Y$ is said to have quadratic regression on $X$. Let $X_1, X_2, \ldots, X_n$
be $n$ independently and identically distributed random variables with a common
distribution function having a finite variance. Let $\Lambda = X_1 + X_2 + \cdots + X_n$ be the sum and
$Q = Q(X_1, X_2, \ldots, X_n)$ be a quadratic polynomial statistic. In the present paper all the
distribution functions which have the property that $Q$ has quadratic regression on $\Lambda$ are
investigated in detail. It is also proved that in each case the distribution function is
uniquely determined by this property. These results contain as special cases the earlier
investigation of M. C. K. Tweedie [cf. J. London Math. Soc., Vol. 21 (1946), pp. 22-28] on the
regression of the sample variance on the sample mean.


46. Distribution of Sample Size in Sequential Sampling. L. L. Lasman, Florida
State University and E. J. Williams, North Carolina State College.


Suppose it is desired to sample sequentially from a mixture of s populations and to cease 
sampling when some criteria have been attained. Suppose further that these criteria can 
be specified in terms of a function of the numbers of observations obtained in the sampling 
process and that an observation can be identified by population only after it has been drawn. 
Then it might be desired to estimate the average size sample that would be needed to satisfy 
the criteria. Under certain assumptions on the functions involved, the asymptotic distribu- 
tion of the sample size for such a procedure is obtained through the use of Wald’s funda- 
mental identity. The mean and variance turn out as relatively simple functions of the 
criteria specified and of the mixture probabilities. Sampling until the standard error of 
the difference between two means reaches a given value, is given as an example for the case 
s = 2. 







47. Generalizations of Thompson's Distribution, II. Andre G. Laurent,
Wayne State University.


Let $X$ be an $n \times p$ random matrix with probability density $f(X) = h(X'X)$, $\xi$ be a $k \times p$
submatrix of $X$, $X'X = TT'$, $T$ lower triangular, $Y = XT'^{-1}$, $\eta = \xi T'^{-1}$ (Schmidt's
orthogonalization process); the marginal and conditional (given $X'X$) distributions of $X$, $\xi$, $Y$,
$\eta$, $\xi'\xi$, $Y'Y$, $\eta'\eta$ are derived. For example, $f(\xi \mid X'X) = K\,|I - (X'X)^{-1}\xi'\xi|^{(n-k-p-1)/2}
|X'X|^{-k/2}$, $f(\eta) = C\,|I - \eta'\eta|^{(n-k-p-1)/2}$, $p \le n - k$, in the proper domain. Applications
are described (U.M.V. unbiased estimates, roots of determinantal equations, bombing
problems, etc.). In case $n = p$, the Cayley parametric representation $U$ of $Y$, $U = U^*$ ($Y$
exceptional), $U = U^{**}$ ($Y$ non exceptional), is $A\,|I + U|^{-(n-1)}\, dU$ distributed (Haar
invariant measure). The problem of constructing a "random" basis in the euclidean space
is considered. Other lines of generalization of Thompson's distribution are studied. The
results generalize (and sometimes specialize) results previously given (Ann. Math. Stat.,
1956, p. 1184; Journ. Soc. Stat. Paris, 1955, pp. 262-296; Journ. Oper. Res. Soc., 1957, pp.
75-89).


48. Optimum Decision Procedures for a Poisson-Process Parameter. James
A. Lechner, University of Maryland.


Rules are discussed for deciding whether $\lambda$, the parameter of a continuous-time Poisson
process, is less than or greater than a given constant $k$. If the cost of observation is
proportional to the length of time the process is observed, and the cost of a wrong decision is
proportional to the magnitude of the error, that is, to $|\lambda - k|$, then an optimum
non-randomized sequential decision procedure is proved to exist and is found, where by an optimum
procedure is meant a procedure which minimizes the total expected cost with respect to
any given prior distribution for $\lambda$ of the incomplete Gamma form with mean $\gamma/t$ and
variance $\gamma/t^2$, $t > 0$, $\gamma$ a positive integer. Some of the results hold true for other cost functions,
prior distributions, and/or random processes; indications are made of some of these.


49. Reduction of Multiple Regression System by Use of Direct Products of
Matrices. Julius Lieblein, U. S. Navy Department.


Let $R$ represent a high order multi-variable polynomial regression, such as might occur
in a large factorial experiment. Let the independent variables of $R$ be arbitrarily separated
into two groups, $S$ and $S'$, and suppose $R$ is arranged according to ascending powers of the
variables in $S$. For every fixed set of values of the variables in $S'$, this gives a new
regression $R(S \mid S')$, with coefficients $C(S')$, depending on $S'$. These coefficients $C(S')$ may then
themselves each be taken as a polynomial regression $R(S')$ over the variables in $S'$. Let
the matrices of the normal equations for the three systems of regressions given by $R$,
$R(S \mid S')$, $R(S')$ be, respectively, $M$, $M_S$, $M_{S'}$. Then it was essentially shown by E. A.
Cornish (Biometrics, March 1957, pp. 19-27) that (*) $M = M_S \otimes M_{S'}$ (Kronecker or direct
product), in a certain class of cases. The present paper generalizes this relationship to any
number of sets $S, S', S'', \ldots$, and to other regressions than polynomials and finds the
conditions for it to hold or not. Considerable savings in computation, and a mathematical
check on regression calculations, were shown by Cornish (ibid.) to be possible when (*)
holds. In general, however, $R$ will have missing terms corresponding to non-significant
interactions and (*) will not hold. The present paper shows also how to obtain
computational savings and a mathematical check even in such cases, especially in conjunction with
high-speed digital computers.
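
Relation (*) can be verified numerically on a toy case. The sketch below assumes a complete factorial layout in two variables, a quadratic in $x$ (the group $S$) crossed with a linear term in $z$ (the group $S'$), with no missing terms; the design points and helper names are illustrative and are not taken from Cornish or from the paper.

```python
def gram(rows):
    """Normal-equations matrix X'X for a design matrix given as a list of rows."""
    p = len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]

def kron(a, b):
    """Kronecker (direct) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

xs = [0, 1, 2]   # levels of the variable in S
zs = [-1, 1]     # levels of the variable in S'

rows_s = [[1, x, x * x] for x in xs]    # terms 1, x, x^2
rows_sp = [[1, z] for z in zs]          # terms 1, z
# Full model: every S term times every S' term, observed at every (x, z) pair.
rows_full = [[s * t for s in (1, x, x * x) for t in (1, z)]
             for x in xs for z in zs]

m = gram(rows_full)
m_s, m_sp = gram(rows_s), gram(rows_sp)
print(m == kron(m_s, m_sp))
```

The 6 × 6 system for the full model factors into a 3 × 3 and a 2 × 2 system, which is the source of the computational saving. Dropping a product term from the full model, as happens with non-significant interactions, breaks this factorization.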







50. On the Characterization of a Family of Populations which includes the
Poisson Population. Eugene Lukacs, Catholic University.


A random variable $Y$ which has finite expectation is said to have constant regression on
a random variable $X$ if the relation $E(Y \mid X) = E(Y)$ holds almost everywhere. The
$k$-statistic of order $j$ is the symmetric, homogeneous polynomial statistic whose expectation
is the $j$-th cumulant; it is denoted by $k_j$. The following theorem is proved: Let $X_1, X_2, \ldots,
X_n$ be a sample of size $n$ taken from a population with distribution function $F(x)$. Let $p \ge 1$
and $r \ge 1$ be two positive integers and assume that the moment of order $p + r$ of $F(x)$
exists. The distribution $F(x)$ is the convolution of a Poisson distribution, the conjugate
to a Poisson distribution and a normal distribution if, and only if, $k_{p+r} - k_p$ has constant
regression on $k_1$. (One or two of the components of $F(x)$ may be absent.)


51. On Queues in Tandem. Gregory E. Masterson, Burroughs Research
Center and Seymour Sherman, University of Pennsylvania.


A queueing system is considered which consists of an infinite number of identical servers
in tandem. The service times for all customers and all servers are independent random
variables with identical probability distributions. The distribution is arbitrary, except
that it has a finite mean. The interarrival times of customers at the input to the system, i.e.,
at the first server, are also independent random variables with identical probability
distributions. Again, the distribution is arbitrary, except that it has a finite mean. When a
customer has been served, he immediately proceeds to the next server, where he may have
to join a queue if that server has not yet finished serving the previous customers. Customers
may, of course, have to queue at the input to the system. The service discipline is "first
come, first served." It is shown that the chance that the interdeparture time between the
$j$th and the $(j + 1)$th customer, from the $n$th server, is less than $x$, tends to zero as $n$ tends
to infinity for each positive $x$, except in the unique case of constant service times.
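
The mechanics of the system described above reduce to a simple recursion: a customer starts service at a server as soon as he has arrived there and the server has finished with the previous customer. The following sketch, with hypothetical names and a finite number of servers, computes departure times from the last server; it illustrates the model only and does not reproduce the limit theorem.

```python
def tandem_departures(arrivals, service):
    """Departure times from the last of several first-come-first-served
    servers in tandem.

    arrivals[j] is the arrival time of customer j at the first server;
    service[j][n] is the service time of customer j at server n.
    """
    n_servers = len(service[0])
    at_server = list(arrivals)          # arrival times at the current server
    for n in range(n_servers):
        departures = []
        free_at = float("-inf")         # time at which the server becomes idle
        for j, t in enumerate(at_server):
            start = max(t, free_at)     # wait in the queue if the server is busy
            free_at = start + service[j][n]
            departures.append(free_at)
        at_server = departures          # departures feed the next server
    return at_server

# Three customers, two servers, constant service time 2.
print(tandem_departures([0, 1, 5], [[2, 2], [2, 2], [2, 2]]))
```

With constant service times no interdeparture interval can be shorter than the service time, which is the exceptional case singled out in the abstract.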


52. Power Characteristics of the Control Chart for Number of Defects, No
Standard Given. Edmund M. McCue, Ohio University.


A standard procedure for testing an industrial process for control with respect to de- 
fects-per-unit is to compare the numbers of defects observed in each of k samples with 
upper and lower control limits based on the total number of defects in the k samples. Meth- 
ods are given for obtaining the probability of a Type I error and the power to detect single 
slippages. It is found that there is considerable variation in the probability of a Type I 
error and power for various values of k. Utilizing this information, procedures are developed 
to increase the effectiveness of the control chart for number of defects. Some asymptotic 
properties of the power are obtained, and it is shown how approximations based on these 
properties can be used in practice. 
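
A minimal sketch of the standard procedure follows. The abstract says only that the limits are based on the total number of defects in the $k$ samples; the conventional three-sigma Poisson limits used here, and the function names, are assumptions made for illustration.

```python
import math

def c_chart_limits(counts):
    """Lower and upper control limits computed from the data themselves.

    Assumes the usual three-sigma limits c_bar +/- 3 * sqrt(c_bar), where
    c_bar is the total number of defects divided by the number of samples.
    """
    c_bar = sum(counts) / len(counts)
    half_width = 3 * math.sqrt(c_bar)
    return max(0.0, c_bar - half_width), c_bar + half_width

def out_of_control(counts):
    """Indices of samples whose defect counts fall outside the limits."""
    lower, upper = c_chart_limits(counts)
    return [i for i, c in enumerate(counts) if c < lower or c > upper]

print(out_of_control([2, 3, 2, 20, 3]))
```

Because an aberrant sample also inflates the total from which the limits are computed, the chance of detecting a slippage depends on $k$, which is consistent with the variation in power reported in the abstract.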


53. On the Distribution of the Sum of Circular Serial Correlation Coefficients
and the Effect of Non-Normality on its Distribution. V. K. Murthy,
University of North Carolina.


Let $x_1, x_2, \ldots, x_{2m+1}$ be a random sample of size $(2m + 1)$ from a normal distribution
with zero mean and unit variance. Let $r_L$ denote the circular serial correlation coefficient
of lag $L$ defined by $r_L = \sum_{j=1}^{2m+1} x_j x_{j+L} / \sum_{j=1}^{2m+1} x_j^2$, where $x_j = x_{2m+1+j}$ for all $j$. Define $\bar{r} =
\sum_{L=0}^{2m} r_L/(2m + 1)$. It is then shown that the distribution of $\bar{r}$ is a beta-distribution. The
effect of non-normality on the distribution of $\sum_{L=0}^{2m} r_L$ is studied by the method of David
and Johnson (Ann. Math. Stat., 1951).


54. Generalized Power Series Distribution and Certain Characterization 
Theorems. G. P. Parit, University of Michigan. 


Let T be an arbitrary countable non-null subset of non-negative numbers and define the
generating function $f(\theta) = \sum_{x \in T} a_x \theta^x$ with $a_x \ge 0$; $\theta \ge 0$ so that $f(\theta)$ is positive, finite and
differentiable. Then we can define a random variable X taking values in T with probabilities
Prob {X = x} = $a_x \theta^x / f(\theta)$, $x \in T$, and call this distribution a Generalized Power Series
Distribution (gpsd). The Binomial, Poisson, Negative Binomial and the Logarithmic
Series distributions and their truncated forms can be obtained as special cases of the gpsd
by proper choice of T, $a_x$ and hence of $f(\theta)$. Recurrence relations are obtained for central,
raw and factorial moments, cumulants etc. which are generalizations of corresponding
results obtained by Romanovsky, Frisch, Haldane, etc. An explicit functional relationship
between the variance and the mean of a gpsd is obtained and based on this relationship, 
some characterization theorems are presented. To mention one, the gpsd with equal vari- 
ance and mean for all admissible parameter values is characterized to be Poisson distribu- 
tion. Some problems of estimation and others have been studied for the gpsd and will be 
presented elsewhere. 
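
The definition above can be made concrete with a short Python sketch. It builds the gpsd probabilities from a table of coefficients and checks two of the special cases named in the abstract. The function names, the parameter values and the truncation of the Poisson support at 60 are assumptions made for this illustration.

```python
import math

def gpsd_pmf(coeffs, theta):
    """Probabilities a_x * theta**x / f(theta); coeffs maps each x in T to a_x."""
    f_theta = sum(a * theta ** x for x, a in coeffs.items())
    return {x: a * theta ** x / f_theta for x, a in coeffs.items()}

def mean_and_variance(pmf):
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var

# Poisson case: a_x = 1/x!, so f(theta) = exp(theta).
# The support is cut off at x = 60 purely for numerical convenience.
poisson_coeffs = {x: 1.0 / math.factorial(x) for x in range(61)}
poisson_mean, poisson_var = mean_and_variance(gpsd_pmf(poisson_coeffs, 2.5))

# Binomial case: a_x = C(n, x), so f(theta) = (1 + theta)**n and p = theta/(1 + theta).
binomial_coeffs = {x: math.comb(10, x) for x in range(11)}
binomial_mean, binomial_var = mean_and_variance(gpsd_pmf(binomial_coeffs, 1.0))
```

In the Poisson case the computed mean and variance agree, which is the property singled out by the characterization theorem quoted above; in the binomial case with n = 10 and p = 1/2 they differ.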


55. Stationary Probabilities for a Semi-Markov Process with Finitely Many
States. Ronald Pyke, Columbia University.


A process $\{Z_t : t \ge 0\}$ is called a Semi-Markov process (S.-M.P.) if, roughly speaking, it
moves from one to another of $m$ ($\le \infty$) states in accordance with a transition matrix as does
a Markov Chain, but where the time between two successive transitions may depend on
the states between which the transition is being made. These processes are then generalizations
of both discrete and continuous parameter Markov Chains. Let $J_n$ denote the state
entered at the n-th transition and let $X_n$ denote the time taken between the $(n-1)$-th
and n-th transitions. An S.-M.P. (or alternatively, a Markov Renewal process) is said to
be regular if for all choices of initial probabilities,
$N(t) = \sup\{n \ge 0 : X_1 + X_2 + \cdots + X_n \le t\} < \infty$ a.s. for every $t \ge 0$. Almost all sample functions of a regular process are
step functions. A characterization of, as well as several sufficient conditions for, regularity
are derived. A classification of states analogous to that for Markov Chains is presented
and studied. Limit theorems are proven under weak restrictions for random variables
$W(t) = \sum_{n=1}^{N(t)} f(J_{n-1}, J_n, X_n)$, for arbitrary real functions f defined on $R_3$. Furthermore,
the a.s. convergence of ratios (Doeblin Ratios) of two such sums is studied.
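
The construction of the process from the pairs $(J_n, X_n)$ can be sketched in Python as follows. This is only a simulation aid under stated assumptions: the two-state transition matrix, the exponential sojourn times and all function names are hypothetical choices, not taken from the paper.

```python
import random

def simulate_smp(P, sojourn, start, horizon, rng):
    """Simulate transition epochs and states J_0, J_1, ... up to time `horizon`.

    P[i][j]            : probability of moving from state i to state j
    sojourn(i, j, rng) : draws the (positive) time X_n spent before an i -> j move
    """
    t, state = 0.0, start
    epochs, states = [0.0], [start]
    while True:
        nxt = rng.choices(range(len(P)), weights=P[state])[0]
        t += sojourn(state, nxt, rng)    # the time may depend on both states
        if t > horizon:
            break
        state = nxt
        epochs.append(t)
        states.append(state)
    return epochs, states

def Z(t, epochs, states):
    """State occupied at time t (t <= horizon), i.e. J_{N(t)}."""
    n = sum(1 for e in epochs if e <= t) - 1    # N(t)
    return states[n]

# Hypothetical two-state example with exponential sojourn times.
P = [[0.0, 1.0], [0.5, 0.5]]
rng = random.Random(0)
epochs, states = simulate_smp(
    P, lambda i, j, r: r.expovariate(1.0 + i + j), 0, 10.0, rng)
```

Every simulated path has finitely many transitions before the horizon and is therefore a step function, which corresponds to the regularity property defined above.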


56. On the Decomposition of Certain Characteristic Functions (Preliminary
report). B. Ramachandran, Catholic University. (Introduced by Eugene
Lukacs.)


A family of characteristic functions is said to be factor-closed if the factors of every 
element of the family belong to the family. It is known that the Normal, the Poisson and 
the Binomial families are factor-closed. Recently Yu. V. Linnik (Teor. Veroyat. 2, 1957) 
proved that the characteristic functions of the compositions of a Normal and a Poisson 
distribution form a factor-closed family. In the present paper it is shown that the charac- 
teristic functions of compositions of a (standard) Poisson and a (standard) Binomial distri- 
bution constitute a factor-closed family. An example is given to demonstrate that the char- 
acteristic functions of the compositions of a Normal and a Binomial distribution do not 
form a factor-closed family. 





57. Generalization of a Theorem of Polya, and Applications. R. Ranga Rao,
Indian Statistical Institute. (Introduced by R. R. Bahadur.) (By title)


Let $\mu_n$ ($n = 1, 2, \ldots$) and $\mu$ be measures on the Borel sets of a separable and complete
metric space X. Let F be a given family of continuous mappings from X into the k-dimensional
Euclidean space $E_k$. Let $\mathcal{R}$ be the class of all sets of the form $f^{-1}(R)$, with $f \in F$
and R a k-dimensional rectangle. Theorem 1. Suppose that F is compact under the topology
corresponding to uniform convergence on compacta, and that $\mu f^{-1}$ has continuous marginal
distributions for each $f \in F$. Then $\mu_n \to \mu$ (weak convergence) implies that
$\sup\{|\mu_n(A) - \mu(A)|,\ A \in \mathcal{R}\} \to 0$. (When X is the real line and F consists of the single function f(x) =
x, this theorem reduces to a well known theorem of Polya.) Theorem 2. If $X = E_k$ and
$\mu \ll$ Lebesgue measure, then $\mu_n \to \mu$ if and only if $\sup\{|\mu_n(C) - \mu(C)|,\ C$ measurable
and convex$\} \to 0$. As an application, we have the following generalization of previous results
of Wolfowitz, and of Fortet and Mourier. Let $\xi_n \in E_k$ ($n = 1, 2, \ldots$) be independent random
vectors with common distribution $\mu$. Let $\mu_n$ be the sample distribution function based
on the first n observations $\xi_1, \ldots, \xi_n$. For each fixed positive integer m, let $\mathcal{H}_m$ be the class
of all sets which are intersections of m half-spaces. Then $\sup\{|\mu_n(A) - \mu(A)|,\ A \in \mathcal{H}_m\}
\to 0$ with probability one. If $\mu \ll$ Lebesgue measure, then $\sup\{|\mu_n(C) - \mu(C)|,\ C$ measurable
and convex$\} \to 0$, with probability one.


58. The Method of Moments Applied to a Mixture of Two Exponential
Distributions. Paul R. Rider, Aeronautical Research Laboratory.


The method of moments is used to estimate the parameters of a mixed exponential dis- 
tribution. Variances of the estimators are derived. 


59. When to Stop. Herbert Robbins, Columbia University. (By title)


Let $\{x_n\}$ be independent random variables with a common distribution function F. We
observe the $x_n$ sequentially and can stop at any time; if we stop with $x_n$ we receive the
payoff $f_n(x_1, \ldots, x_n)$. Problem: what stopping rule maximizes the expected payoff? It is
shown that for $f_n(x_1, \ldots, x_n) = \max(x_1, \ldots, x_n) - cn$, $c > 0$, the optimum stopping
rule when the first moment of the $x_n$ exists is: stop with the first $x_n > \alpha$ where $\alpha$ is the
root of the equation $\int (x - \alpha)^+ \, dF(x) = c$; the expected payoff is then $\alpha$.
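
The threshold equation can be solved numerically in a few lines. The Python sketch below does this by bisection for one assumed case, observations uniform on [0, 1] with cost c = 0.02, and then checks that the expected payoff of the threshold rule equals the threshold. The choice of distribution, the cost and the function names are assumptions made for the example.

```python
def expected_excess_uniform(a):
    """E (X - a)^+ for X uniform on [0, 1]."""
    if a >= 1.0:
        return 0.0
    if a <= 0.0:
        return 0.5 - a
    return (1.0 - a) ** 2 / 2.0

def threshold(expected_excess, c, lo, hi, tol=1e-12):
    """Bisection for the root of expected_excess(a) = c (the function is decreasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_excess(mid) > c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

c = 0.02
alpha = threshold(expected_excess_uniform, c, lo=-10.0, hi=1.0)

# Expected payoff of "stop at the first observation above alpha":
# the stopped value is uniform on (alpha, 1), and the number of
# observations is geometric with success probability 1 - alpha.
payoff = (1.0 + alpha) / 2.0 - c / (1.0 - alpha)
```

For this case the root is $\alpha = 1 - \sqrt{2c} = 0.8$, and the payoff computed from the threshold rule is also 0.8, in agreement with the last statement of the abstract.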


60. On Estimating the Mean of a Finite Population. J. Roy, Indian Statistical
Institute, and I. M. Chakravarti, University of North Carolina. (Introduced
by R. C. Bose.)


Consider a population consisting of a finite number N of distinguishable elementary
units $u_i$ with associated real numbers (variate-values) $y_i$, $i = 1, 2, \ldots, N$. Let
the mean and the variance of the population be respectively $\mu = \frac{1}{N} \sum_{i=1}^{N} y_i$
and $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - \mu)^2$. Let {U} denote a countable collection of derived
units $U(x)$, $x = 1, 2, \ldots$, formed by combining the elementary units. Only one
of the derived units is to be selected, the probability of selecting $U(x)$ being $p(x)$, and the
variate-values for all the elementary units in the selected derived unit are to be determined.
The estimate of $\mu$ is the random variable $T = t(X) = \sum_{i=1}^{N} y_i a_i(X)$ where Prob (X = x) =
$p(x)$, $x = 1, 2, \ldots$, and the set of coefficients $a_i(x)$ associated with a derived unit $U(x)$
are chosen so that T is an unbiased estimate of $\mu$ and has finite variance. In this paper an
admissible estimate and a complete class of estimates of $\mu$ have been obtained. If the sampling
scheme is "balanced", a best estimate of $\mu$ in the class of linear unbiased estimates T
which have variances proportional to $\sigma^2$ is shown to exist.





61. A Solution of the Classification Problem. S. N. Roy, University of North
Carolina. (By title)


For one-way classification the problem is the following. Given k observed random samples 
of experimental units (on each of which p kinds of observations have been made) drawn 
from k populations with known distribution forms but unknown parameters, and given 
another experimental unit carrying p kinds of observations, how to assign the experimental 
unit to one of the k populations? A heuristic solution of this problem is offered when the 
k-populations are p-variate normal ones (with unknown parameters), and this is then ex- 
tended to the case of two-way or multi-way classification under the models of multivariate 
analysis of variance. The method offered is then formally extended to one- or multi-way
classification problems under distribution forms not necessarily normal; with p = 1 one
would have the univariate case. A comparison with the solution (not yet available) in
terms of the general decision function approach is desirable, for it is felt by the author 
that the latter solution, while much more difficult, would be better and more rational than 
the easier but heuristic one offered here. 


62. On the Determinants and Characteristic Equations of a Class of Patterned
Matrices. S. N. Roy, B. G. Greenberg, and A. E. Sarhan, University
of North Carolina.


In three previous papers by the authors, inverses were given of a class of patterned 
matrices that occur in a wide variety of problems, including those in univariate and multi- 
variate analysis of variance, the exploration and study of response surfaces and the handling 
of censored data. For some aspects of these problems one may need to obtain, in addition 
to the inverses, (i) the determinants, (ii) the characteristic equations and (iii) the char- 
acteristic roots of these patterned matrices. This paper obtains (i) and (ii), and in forms 
that turn out to be nearly as simple and patterned as the inverses obtained earlier. For 
special values of some of the parameters involved, (iii) comes out in a simple form, but for 
the more general cases one has to resort to the numerical solution of the characteristic 
equations (using any of the various methods in vogue), in order to obtain the characteristic 
roots. These roots, both for the general and the special cases, again happen to be patterned 
in the same sense as the inverses and the characteristic equations. 


63. On the Efficiency of Experimental Designs. S. N. Roy, S. S. Shrikhande,
and P. R. Krishnaiah, University of North Carolina.


With the randomized block design furnishing the yardstick, the efficiency of two-dimensional
designs has been studied from the standpoint of point estimation in the case of a
single response type and under certain further well-known restrictions. It is the purpose 
of this paper to start a study of efficiency, for a single response type, from the viewpoint 
of (i) the power function, (ii) the confidence bounds, both total and partial, on parametric 
functions measuring departures from the total and partial hypotheses, and also (iii) point 
estimation under assumptions broader than usual, and furthermore to generalize this study 
to the case of experiments with multiple response types. Most of the requisite basic concepts 
are discussed here, and detailed formulae are given for some classes of BIB and PBIB de- 
signs. Further work along the same lines is underway. 


64. The Estimation of the Location of a Discontinuity in Density. Herman
Rubin, Michigan State University.


Let $x_1, \ldots, x_n$ be independent multivariate random variables with common density of
the form $\phi_i(x \mid \theta)$ for $x \in R_i(\theta)$. Then under suitable regularity conditions, hyperefficient
estimates, including maximum likelihood estimates, exist for the parameters of the $R_i$.
The results are similar to those obtained by Chernoff and Rubin in “The estimation of the
location of a discontinuity in density,” Proc. Third Berkeley Symposium on Math. Stat.
and Prob., Vol. 1. The limiting distribution of the estimates involves a study of stochastic
processes with multidimensional “time” which have some Markov properties.


65. A Modified Procedure for Group Testing. Milton Sobel, New York
University and Bell Telephone Laboratories.


In group-testing a binomial sample of size N is given and any number x ($1 \le x \le N$) of
units can be tested simultaneously. Each test determines either that all x units are good
or that at least one defective is present (it is not known how many or which ones are bad).
It has been shown in [1] that for known a priori probability q of a unit being good a procedure
$R_1$ based on recursion formulae is optimal under a certain restriction, namely that
in selecting a group to be tested one should not mix “binomial” units with units from a
set known to contain at least one defective. A modified procedure $R_2$ is now developed
which allows a certain “small” amount of mixing and it furnishes an improvement over $R_1$.
The extent of the improvement is numerically investigated for selected values of N and q.
It is not yet known whether (or to what extent) the procedure $R_2$ is optimal in the
unrestricted case.


66. A Problem in Restrictive Group-Testing. Milton Sobel and Phyllis A.
Groll, Bell Telephone Laboratories.


In group-testing a binomial sample of size N is given and any number x ($1 \le x \le N$) of
units can be tested simultaneously. Each test determines either that all x units are good
or that at least one defective is present (it is not known how many or which ones are bad).
In some applications of group testing it is desirable to apply the restriction that any one
unit not be included in more than k group tests. Based on recursion formulae, a group testing
procedure is developed, for known a priori probability q of a unit being good, which
satisfies the above restriction. In the special case k = 2, it is clear that for any set containing
at least one defective, each unit in this set must be tested separately; this type of solution
was proposed by Dorfman for the unrestricted problem. In the special case k = 3,
tables and explicit rules are prepared for all values of q up to N = 8. One particular application
is the field of pooled blood testing, where the restriction insures that it will not be
necessary to take more than one blood sample from each patient.
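
The Dorfman-type solution mentioned for k = 2 is easy to evaluate numerically. The Python sketch below assumes the usual two-stage scheme (test a group of x units once, and if it fails test each unit separately) and computes the expected number of tests per unit. It is not the recursive procedure developed in the paper, and the function names are illustrative.

```python
def dorfman_tests_per_unit(x, q):
    """Expected tests per unit for groups of size x when each unit is good with probability q."""
    if x == 1:
        return 1.0                       # every unit is simply tested once
    # one group test, plus x individual tests when the group is not all good
    return 1.0 / x + 1.0 - q ** x

def best_group_size(q, max_size):
    """Group size in 1..max_size that minimizes the expected tests per unit."""
    return min(range(1, max_size + 1),
               key=lambda x: dorfman_tests_per_unit(x, q))
```

With q = 0.99 and group sizes up to 50, the minimum is attained at x = 11, at a little under 0.2 tests per unit. No unit is ever included in more than two tests, so the scheme respects the restriction with k = 2.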


67. On the Probability of Detection of Noise-Like Signals. W. M. Stone,
Boeing Airplane Co. and Oregon State College, and K. J. Hammerle,
Boeing Airplane Co. (Introduced by J. Bryce Tysver.)


The classical paper of Kac and Siegert (J. Appl. Phys., 18: 383-397) dealt with the detection
of nonrandom signals by a receiver system. In the present paper the signal is assumed
to be a random process with a prescribed spectrum. A properly chosen adjustment
on the transfer function of the bandpass filter has the effect of modifying the distribution 
of the output of the system in terms of the signal bandwidth. Third order cumulants are 
obtained, also suitable approximations to the probability of detection. 


68. Identifiability of Mixtures. Henry Teicher, Purdue University.


If $\mathcal{F} = \{F(x; \alpha),\ \alpha \in R^m\}$ is a family of distribution functions (c.d.f.'s) and $\mathcal{G} = \{G(\alpha)\}$
a class of non-degenerate m-dimensional c.d.f.'s, then a class $\mathcal{H} = \{H\}$ of $\mathcal{G}$-mixtures of $\mathcal{F}$
(i.e. (*) $H(x) = \int F(x; \alpha) \, dG(\alpha)$) is called identifiable if (*) effects a 1-1 correspondence
between $\mathcal{H} \cup \mathcal{F}$ and $\mathcal{G} \cup \mathcal{J}$, where $\mathcal{J}$ is the class of degenerate distributions assigning mass
one to a single point of $R^m$ (“On the Mixture of Distributions,” Ann. Math. Stat., March,
1960). It is shown that if m = 1 and $\mathcal{F}$ is an additively closed family with $\alpha$ varying over
the non-negative integers, rationals or reals then the induced class $\mathcal{H}$ of mixtures is identifiable.
Some scale parameter mixtures not therein encompassed may be handled by a method
indicated. Applications are made to the classical families of Gamma, Uniform and Binomial
distributions.


69. On the Problem of Negative Estimates of Variance Components. William
A. Thompson, Jr., University of Delaware.


The usefulness of variance component techniques is frequently limited by the occurrence
of negative estimates of essentially positive parameters. This paper demonstrates
that the principle of maximum likelihood, properly applied, will remove this objectionable
characteristic in certain cases. From a conceptual viewpoint, the solution of the problem
of negative estimates of variance components, at least in so far as maximum likelihood is
concerned, is that the likelihood function should be maximized subject to the constraints
that all variances should be non-negative. The results of Kuhn and Tucker on nonlinear
programming greatly facilitate carrying out the mechanics of this objective. The technique
has been successfully applied for the following random models: one and two factor experiments
with multiple observations in each cell, two factor experiment with a single observation
per cell, and the n-fold hierarchal classification. The problem of determining the
precision of instruments in the two instrument case [Grubbs, J.A.S.A., 1948] is dealt with,
and a surprising though not unreasonable answer is obtained.
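
For the simplest of the models listed, the balanced one factor random model, the constrained maximum can be written in closed form. The Python sketch below uses that textbook closed form rather than the Kuhn and Tucker machinery of the paper; the function name and the data layout (a list of equally sized groups) are assumptions.

```python
def one_way_ml(groups):
    """Constrained ML estimates (sigma_a^2, sigma_e^2), balanced one-way random model."""
    a = len(groups)                  # number of groups
    n = len(groups[0])               # observations per group
    assert n >= 2 and all(len(g) == n for g in groups)
    means = [sum(g) / n for g in groups]
    grand = sum(means) / a
    ssa = n * sum((m - grand) ** 2 for m in means)                      # between groups
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)   # within groups
    mse = sse / (a * (n - 1))
    lam = ssa / a       # unconstrained ML estimate of sigma_e^2 + n * sigma_a^2
    if lam >= mse:
        return (lam - mse) / n, mse
    # the constraint sigma_a^2 >= 0 is active: pool both sums of squares
    return 0.0, (ssa + sse) / (a * n)
```

When the between-group variation is too small, the unconstrained solution would give a negative component; the constrained solution sets it to zero and pools the sums of squares into the error variance.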


70. An Infinite Packing Theorem for Spheres: A New Application of the
Borel-Cantelli Lemma. Oscar Wesler, University of Michigan.


A classic example (Wolff, 1921) in the theory of functions of a complex variable involves 
removing a sequence of disjoint circles from a given circle in such a way that only a set of 
measure zero remains behind. Borel observed indirectly that the areas of such circles necessarily
form a convergent series whose rate of convergence is less rapid than that of the
series with general term a and wondered what the order of magnitude of these circles 
might be, and whether one could determine it directly. It turned out that Borel’s bound 
was incredibly weak: the convergence is actually so much slower that the radii of the circles 
form a divergent series! Various proofs have been given using the relatively heavy machin- 
ery of complex function theory. In this paper a direct and simple proof is given using the 
easier half of the Borel-Cantelli lemma. In fact, our method is such that it yields at once 
a result of much greater generality: we show that the infinite packing theorem just men- 
tioned holds not only for circles in the plane, but that analogous results hold for spheres in 
n-space, as well as for more general figures. 


71. On Time Series Analysis and Reproducing Kernel Spaces. N. Donald
Ylvisaker, Columbia University.


Let X(·) be a real valued, weakly stationary of second order, continuous parameter
process with $E[X(s)X(t)] = K(s, t) = k(s - t)$. The reproducing kernel space H(K) of
functions, which is associated with the kernel K, is a representation of the process X(·).
The realization of the group of unitary operators in H(K) is given. The properties of functions
in H(K) are related to the properties of the kernel K and, in particular, a sufficient
condition is given that H(K) consist of quasi analytic functions. The notion, due to Kolmogorov,
of processes subordinate to X(·) is treated, and the reproducing kernel space corresponding
to a subordinate process is characterized relative to H(K). The linear extrapolation
problem is viewed in H(K). Specifically, necessary and sufficient conditions are given
that H(K) correspond to a deterministic, non-deterministic, or regular non-deterministic
process. Sufficient conditions are given in this context, that a process subordinate to a
deterministic (regular non-deterministic) process be itself deterministic (regular non-deterministic).
A subordinating operation is given for which the subordinate process is
deterministic and mutually subordinate with the original process X(·).


72. Some Randomization Consequences in Balanced Incomplete Blocks.
George Zyskind, University of North Carolina and Iowa State
University.


The analysis of balanced incomplete blocks is developed directly from the randomiza- 
tion consequences of the experimental procedure and under the general case of no addi- 
tivity assumptions. It is shown that expected values of squares of partial observational 
means, as well as the expected values of products of individual observations, admit simple 
and easily specifiable expressions in terms of Σ's, linear functions of the population variances
that are uniquely determined by the structure of the experiment. The expected values of
mean squares in the analysis of variance tables and the expression for the average variance
of estimated treatment differences are then derived as a simple consequence. An extension
to the case where the intended amounts of treatment are subject to error is indicated.
The correlational structure of the observations under the simplifying additivity
and/or homogeneity assumptions is examined. The relationship of this structure with the 
ones generally given in connection with assumed models is exhibited. Some estimation 
problems are discussed. 


73. Optimum Experimental Designs. J. Kiefer, Cornell University, (Invited
paper).


Let $f_1, \ldots, f_k$ be functions on a space X. We consider the regression problem where an
experiment at x yields an observation with expectation $\sum_i \theta_i f_i(x)$. A design is (approximately)
a probability measure $\xi$ on X which describes the proportion of observations to
be taken at each value x. Let $M(\xi)$ be the matrix of elements $\int f_i f_j \, d\xi$, and write $M_2$ for the
lower right-hand $(k - s) \times (k - s)$ submatrix of M. Write $f^{(2)}$ for the vector of the last
k − s functions of the vector f of $f_i$'s. The results of Kiefer and Wolfowitz (Canadian J.,
1960) are generalized to the case where we are interested in s < k parameters: Theorem.
The following are equivalent if $M(\xi^*)$ is nonsingular (with an analogous result in the
singular case): (1) $\xi^*$ minimizes the generalized variance of the best linear estimators of
$\theta_1, \ldots, \theta_s$; (2) $\xi^*$ minimizes $\max_x d(x, \xi)$, where

$$d(x, \xi) = f(x)' M^{-1}(\xi) f(x) - f^{(2)}(x)' M_2^{-1}(\xi) f^{(2)}(x);$$

(3) $\max_x d(x, \xi^*) = s$. A characterization of the set of all such $\xi^*$ is also given. The results
complement those of Kiefer and Wolfowitz (Ann. Math. Stat., 1959), and yield improved
computational techniques in many cases. Numerous applications are given, e.g., to problems
of polynomial regression on a q-dimensional cube or simplex; in particular, it is shown
which of the designs considered by Scheffé (J.R.S.S. (Ser. B), 1958) are optimum.
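
Condition (3) of the theorem can be checked numerically for a small case. The Python sketch below takes quadratic regression on [−1, 1] with the coefficient of $x^2$ as the single parameter of interest (s = 1, k = 3), and a design with weights 1/4, 1/2, 1/4 at −1, 0, 1. The regression functions, the design, the grid and all function names are assumptions chosen for this illustration; the check is only made on a grid.

```python
def inverse(m):
    """Gauss-Jordan inverse of a small nonsingular square matrix."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def quad_form(v, m):
    return sum(v[i] * m[i][j] * v[j]
               for i in range(len(v)) for j in range(len(v)))

def information_matrix(f, design):
    """M(xi) for a discrete design given as {point: weight}."""
    k = len(f(next(iter(design))))
    return [[sum(w * f(x)[i] * f(x)[j] for x, w in design.items())
             for j in range(k)] for i in range(k)]

def d(x, f, design, s):
    """d(x, xi) for interest in the first s parameters."""
    m = information_matrix(f, design)
    value = quad_form(f(x), inverse(m))
    if s < len(m):
        m2 = [row[s:] for row in m[s:]]          # lower right-hand block
        value -= quad_form(f(x)[s:], inverse(m2))
    return value

def f(x):
    return [x * x, x, 1.0]       # parameter of interest listed first

design = {-1.0: 0.25, 0.0: 0.5, 1.0: 0.25}
grid = [i / 100.0 - 1.0 for i in range(201)]
max_d = max(d(x, f, design, s=1) for x in grid)
```

For this design $d(x, \xi) = (2x^2 - 1)^2$, whose maximum over [−1, 1] is 1 = s, attained at the three support points, so condition (3) holds on the grid.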


74. Semi-Markov Processes: Countable State Space. Ronald Pyke, Columbia
University, (Invited paper).


Let $A = (a_1, a_2, \ldots, a_m)$ be a vector of $m \le \infty$ probabilities and let $Q$ and $\tilde{Q}$ be two
matrices of transition distributions. Let $\{(J_n, X_n) : n \ge 0\}$ be a process satisfying $X_0 = 0$,
$P[J_0 = k] = a_k$, $P[J_1 = j, X_1 \le x \mid J_0 = i] = \tilde{Q}_{ij}(x)$ and for $n > 1$,
$P[J_n = j, X_n \le x \mid J_0, J_1, \ldots, J_{n-1}, X_1, \ldots, X_{n-1}] = Q_{J_{n-1}, j}(x)$. Define
$N(t) = \sup\{k \ge 0 : X_0 + X_1 + \cdots + X_k \le t\}$ and $Z_t = J_{N(t)}$. The process $\{Z_t : t \ge 0\}$ thus defined is called a general
Semi-Markov process (G.S.-M.P.) determined by $(m, A, \tilde{Q}, Q)$. Essentially, therefore, a
G.S.-M.P. is an S.-M.P. with random starting conditions. For the S.-M.P. determined by
$(m, A, Q)$, define $R_{ijk}(x; t) = P[Z_t = j, J_{N(t)+1} = k, S_{N(t)+1} \le t + x \mid Z_0 = i]$. The complete
limiting behavior of this function as $t \to \infty$ is obtained. In particular, if state j is
recurrent and $G_{jj}$, the recurrence time distribution of state j, is non-lattice, then
$\lim_{t \to \infty} R_{ijk}(x; t) = c_{ij} \mu_{jj}^{-1} \int_0^\infty [Q_{jk}(y + x) - Q_{jk}(y)] \, dy$ where $c_{ij}$ is the probability of reaching
state j from state i, and where $\mu_{jj}$ is the mean recurrence time of state j. From this
result, it is possible to obtain specific quantities $A$ and $\tilde{Q}$ such that the G.S.-M.P. determined
by $(m, A, \tilde{Q}, Q)$ has the property that the three dimensional “age” process
$\{(J_{N(t)}, J_{N(t)+1}, S_{N(t)+1} - t) : t \ge 0\}$ is a wide sense stationary process.


75. Generalized Bayes Solutions in Estimation Problems. Jerome Sacks, 
Columbia University, (Invited paper). 


For simplicity consider the estimation on the basis of a sample of size one of the mean
$\omega$ of a normal distribution with variance one with the loss function being squared error. If
$\xi$ is an a priori distribution then the Bayes estimate with respect to $\xi$ is $E^{\xi}_x \omega$ (the a posteriori
expected value of $\omega$). Let F be a distribution function whose total variation over the
space $\Omega = \{\omega\}$ is infinite but having the property that $E^F_x \omega$ is finite for all x. Call $E^F_x \omega$ a
Generalized Bayes Solution (G.B.S.). These G.B.S. arise as limits of ordinary Bayes solutions.
In case $\Omega$ is a half-infinite interval it can be proved that the class of G.B.S. together
with the class of B.S. form a complete class. A sidelight of these considerations is this: Take
$\Omega = [0, \infty)$ and F to be Lebesgue measure on $\Omega$; then an admissible minimax estimate of $\omega$ is
$E^F_x \omega$. These notions are extended to other loss functions. For some classes of distributions
other than the normal class a complete class theorem of the type mentioned above is
proved.
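
For the sidelight mentioned above (the mean restricted to $[0, \infty)$ with Lebesgue measure as the generalized prior), the a posteriori expected value has a simple closed form: the formal posterior is a unit-variance normal distribution centred at x and truncated to $[0, \infty)$, whose mean is $x + \varphi(x)/\Phi(x)$. The Python sketch below evaluates it; the function names are illustrative.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def generalized_bayes_estimate(x):
    """Posterior mean of the mean given one observation x, flat prior on [0, infinity).

    Not suitable for very large negative x, where normal_cdf(x) underflows to zero.
    """
    return x + normal_pdf(x) / normal_cdf(x)
```

The estimate is always positive, exceeds x, and approaches x as x grows large.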





NEWS AND NOTICES 


Readers are invited to submit to the Secretary of the Institute news items of interest. 
Personal Items 


Dr. David W. Alling has completed a period of graduate study in statistics 
at Cornell University and is now associated with the Therapeutic Trials Section, 
National Cancer Institute, Bethesda, Maryland. 

Lt. David R. Barr has been appointed to an instructorship at the USAF 
Academy. 

Dr. Richard E. Beckwith will join the Aeronutronic Division of the Ford 
Motor Company next month, as Supervisor of their Operations Research Sec- 
tion. 

Gunnar Blom, formerly Statistician at MAB och MYA Textile Factories, 
Malmoe, and at Agricultural Research Institute, Swedish Sugar Co., Malmoe, 
has been appointed Docent in Mathematical Statistics at the University of 
Lund, Lund, Sweden. 

Case Institute of Technology, Cleveland, Ohio, announces: Dr. R. C. Bose, 
on leave of absence from the University of North Carolina, has joined Case as 
a Visiting University Professor in Statistics. 

Dr. Irwin D. J. Bross, who for the past seven years was Statistical Consultant 
at Cornell University Medical College and the Sloan-Kettering Institute for 
Cancer Research, has been made Chief of the Department of Statistics of 
Roswell Park Memorial Institute for Cancer Research in Buffalo, New York. 

Dr. L. Dennis Cannon, who received the Ph.D. degree in Industrial Psy- 
chology from Purdue University on July 31, 1959, is now employed by the 
Armor Human Research Unit, a branch of George Washington University 
Human Resources Research Office. 

Gregory C. Chow has been appointed associate professor in the Graduate 
School of Business and Public Administration, Cornell University. 

Gertrude M. Cox has resigned as Director, Institute of Statistics of The 
Consolidated University of North Carolina. She will continue for the present as 
Professor of Statistics at North Carolina State College. Dr. Cox assumed a 
new position as Head, Statistics Research Division, The Research Triangle 
Institute on January 1, 1959. 

Bruce A. Drew has been recently appointed Experimental Design Statistician 
for Research and Quality Control Division, Refrigerated Products, The Pills- 
bury Co., New Albany, Indiana. 

Dr. Henry D. Friedman, formerly with General Electric Co., Syracuse, New 
York, is now with Technical Operations, Inc., Burlington, Massachusetts. 

Alan H. Gepfert now resides at 420 Church Street, Evanston, Illinois with 
his bride of August 26, Mary Bosworth of Ohio. Gepfert is now providing a 
regular monthly series for the Modern Railroads publication entitled “Modern
Railroads Chart the Railroad Business.”


Sister Catherine J. Gillis received the Ph.D. degree in Statistics with the 
dissertation: Some Problems Concerning The Wolfowitz Two-Sample Test. 

Irwin Guttman is now at McGill University as Associate Professor of Statis- 
tics in the Department of Mathematics. 

William L. Harkness has completed the requirements for the Ph.D. degree 
in Mathematical Statistics at Michigan State University and has accepted a 
position as Assistant Professor in the Department of Mathematics at Penn- 
sylvania State University. 

William G. Howard, former Assistant for Operations Analysis, Hq., USAF, is 
now employed as Operations Analyst in The Operational Sciences Laboratory 
of the Research Triangle Institute, Durham, North Carolina. 

Dr. Robert Hultquist received the Ph.D. degree in Mathematical Statistics 
from Oklahoma State University on August 8, 1959 and is now an Assistant 
Professor of Mathematics at DePauw University, Greencastle, Indiana. 

Donald A. Jones, State University of Iowa, has been appointed assistant
professor in the Mathematics Department of the University of Michigan. 

Dr. A. R. Kamat, who was until recently Professor of Mathematics and Statistics
at the Fergusson College and the B. M. College of Commerce, Poona-4,
India, is now Research Professor of Theoretical and Applied Statistics at the
Gokhale Institute of Politics and Economics, Poona-4, India.

Professor Leo Katz of Michigan State University will spend the academic 
year 1959-60 in the O.N.R. London Branch Office as Scientific Liaison Officer 
(Mathematical Statistics). Visiting statisticians are invited to drop in at Keysign 
House, 429 Oxford Street, W.1. He is living at 7 Porchester Gate, W.2 and 
may be reached there by dialing Bay 4108. 

Dr. L. O. Kattsoff has become professor of Mathematics and Logic at Boston 
College, Chestnut Hill, Boston, Massachusetts. 

Robert A. Koenig, formerly statistician with National Lead Company, 
South Amboy, N. J., is now employed by the Data Processing Division of the 
Royal McBee Corporation, 913 Penn. Avenue, Pittsburgh, Pa., as a computer 
applications analyst. 

Albert Mindlin has transferred from the position of Technical Assistant to 
the Chief, Statistics Branch, Bureau of Old Age and Survivors Insurance, U. S.
Department of Health, Education, and Welfare, to Chief Research Statistician, 
Department of General Administration, Government of the District of Columbia. 

G. Baley Price is on leave from the University of Kansas, while spending the 
current academic year as a Visiting Professor at the California Institute of 
Technology. 

M. M. Rao has accepted a position as Research Mathematician in the Mathe- 
matics Department of Carnegie Institute of Technology. Dr. D. K. Ray-Chaud- 
huri has been appointed a Research Associate in the Statistical Laboratory. 

Howard R. Roberts has resigned the position as operations analyst in the 
Management Systems Division of the Johns Hopkins University Operations 
Research Office in order to accept a position as mathematical statistician in the 
Washington office of Booz-Allen Applied Research, Inc. 





Murray Rosenblatt, formerly Associate Professor at Indiana University, has 
been appointed Professor of Mathematical Probability and Statistics in the 
Division of Applied Mathematics, Brown University, Providence, Rhode Island. 

Elizabeth A. Shuhany received the Ph.D. degree from Boston University in 
Statistics with the dissertation: The S;-Test Against Linear Trend. 

Professor Jack Silber has returned from Europe where he was Consultant to 
the Operations Analysis Office, Hq. United States Air Forces in Europe. 

Dr. B. V. Sukhatme has taken leave of absence from the Institute of Agricultural
Research Statistics, New Delhi from September 1959 to spend a year as
Visiting Associate Professor in the Department of Statistics at Michigan State 
University. 

Howard G. Tucker, Assistant Professor of Mathematics at the University 
of California, Riverside, is on leave of absence for the academic year 1959-60. 
He is spending the year at the Berkeley campus on a research project, “Stochastic
Models for Carcinogenesis,” under the direction of Professor J. Neyman.

Madanlal T. Wasan has joined the Department of Mathematics of Queen’s 
University, Kingston, Ontario, Canada as Assistant Professor. 


NEW MEMBERS 


The following persons have been elected to membership in the Institute 


Abraham, John K., M.A. (University of California), Research Assistant, University of 
California Statistical Laboratory, Department of Statistics, Berkeley, California 

Ammeter, Hans, Schweizerische Lebensversicherungs- und Rentenanstalt, Zurich, Alpenquai
40, Switzerland.

Appleby, Robert H., B.S. (Washington College), Graduate Student, Virginia Polytechnic
Institute, Blacksburg, Virginia; 709 Main Street, Apt. C-1, Blacksburg, Virginia.

Barton, David Elliott, Ph.D. (London University), Lecturer, Department of Statistics,
University College London; University of London, Gower Street, London W.1., England.

Beatty, James K., M.S. (Ohio University), Head, Theoretical Analysis Sub-Group, Thiokol
Chemical Corporation, Elkton Division, Elkton, Maryland; Maple Square Trailer
Park, Rd. 2, Newark, Delaware.

Bergstrand, Karl-Georg, Fil. Kand. (University of Stockholm), Mathematical statistician, 
Skanska Attikfabriken, Perstorp, Sweden. 

Bergstrom, Harold, Ph.D. (University of Uppsala), Head, Applied Mathematics, Chalmers
University of Technology, Institute of Applied Mathematics, Goteborg, Sweden.

Bhargava, Triloki Nath, M.Sc. (University of Lucknow), Research Assistant, Department 
of Statistics, Michigan State University, East Lansing, Michigan. 

Bolland, Thomas W., M.B.A. (University of Chicago), Research Analyst in Field of
Marketing Research, Market Facts, Inc., 39 So. LaSalle Street, Chicago 3, Ill.; 213 Elgin,
Forest Park, Illinois.

Buhlmann, Hans, Ph.D. (Swiss Federal Institute of Technology), Actuary, Swiss
Reinsurance Company, Mythenquai 60, Zurich, Switzerland.

Burton, Robert C., B.Sc. (Brigham Young University), Student—National Science fellow, 
Dept. of Statistics, University of North Carolina, Chapel Hill, North Carolina; 3922 
N. 4th Street, Apt. 1, Arlington 3, Virginia. 

Campbell, R. Colin, Ph.D. (University of Cambridge), Lecturer in Statistics and Honorary
Officer-in-charge of Agricultural Research Council Statistics Group, School of
Agriculture, Downing Street, Cambridge, England.

Chaddha, Roshan L., M.S. (Agra University), Research Assistant, Virginia Polytechnic
Institute, Box A-304, V. P. I., Blacksburg, Virginia.





250 NEWS AND NOTICES 


Chung, Sae Ho, M.A. (University of Illinois), Teaching Assistant, Department of Mathe- 
matics, University of Illinois, Champaign, Illinois. 

Conner, Robert J., B.E.S. (The Johns Hopkins University), Operations Research Assist- 
ant, The Johns Hopkins Hospital, 601 N. Broadway, Baltimore 5, Maryland; Indus- 
trial Engineering Department, The Johns Hopkins University, Baltimore 18, Maryland. 

Davis, Miles, S.M. (Harvard University), Student, 104 Richards Hall, Cambridge 38,
Massachusetts.

Dhrymes, Phoebus J., B.A. (University of Texas), Student, Department of Economics,
Massachusetts Institute of Technology, Cambridge 39, Mass.

Dunn, James E., B.Sc. (University of Nebraska), Graduate student, Department of
Agronomy, Soil Chemistry Division, University of Nebraska, Lincoln, Nebraska; c/o Russell
Dunn, Rural Route 1, DeWitt, Nebraska.

Esseen, Carl-Gustav, Ph.D. (University of Uppsala), Professor, Division of Applied Mathe- 
matics, The Royal Institute of Technology, Stockholm 70, Sweden. 

Farlie, Dennis J. G., B.Sc. (London University), Statistician, Research Laboratories of
The General Electric Company Ltd., East Lane, North Wembley, Middlesex, England;
22 Bouverie Road, West Harrow, Middlesex, England.

Fellman, Johan Olof, Cand.Phil. (University of Helsinki), Teacher, Tekniska Laroverket i 
Helsingfors (The Technical College of Helsinki), Appollogatan 8-10, Helsinki, Finland; 
Magistervagen 14, Grankulla, Finland. 

Furukawa, Nagata, M.Sc. (Kyushu University), Assistant, Faculty of Science, Kumamoto 
University, Kurokami-machi, Kumamoto-shi, Japan. 

Ganelius, Cord H., Fil.Dr. (University of Stockholm), Professor, Institute for
Mathematics, University of Goteborg, Goteborg O, Sweden; Kallebacksvagen 19, Goteborg
S., Sweden.

Gillis, Paul P., Agrege de l'Enseignement Superieur (Universite de Liege), Professor,
President of the Institute of Statistics, University of Bruxelles, 50, F. D. Roosevelt
Avenue, Bruxelles, Belgium; 134 rue de Livourne, Bruxelles, Belgium.

Henry, Patrick, A.B. (San Diego State College), Research Engineer, Convair Astronautics 
San Diego, California; 591-10 San Diego, California. 

Gundy, Richard F., A.B. (Illinois College), Student, Indiana University Department of 
Psychology, Bloomington, Indiana. 

Gustafsson, Stig S., M.S. (University of Helsingfors), Teacher, Chief Statistician,
Finland Institute of Technology, Abrahamsg. 1-5, Helsingfors, Finland; Stentorpsvagen
8, A. 8, Munksnas, Finland.

Hahn, Gerald J., M.S. (Columbia University), Statistician—Experimental Design and 
Analysis, General Electric Co., General Engineering Laboratory, 1 River Road, Sche- 
nectady 8, New York; 7A2 Sheridan Village, Schenectady 8, New York. 

Hyrenius, Hannes, Ph.D. (Lund University), Professor of Statistics, Statistical Institute,
University of Gothenburg, S. Vaegen 54, Gothenburg S, Sweden.

Jacobs, Konrad, Dr.rer.nat. (Universitat Munchen), Professor, Universitat Gottingen,
Institut fur Mathematische Statistik und Wirtschaftsmathematik, Gottingen,
Deutschland, Bunsenstr. 3; Gottingen, Deutschland, Dustere Eichenweg 39.

Jenkins, Gwilym M., Ph.D. (University College, London), Lecturer in Mathematical
Statistics, Imperial College of Science, University of London, Exhibition Rd.,
Kensington, London, S. W. 7, England; Department of Statistics, Stanford University,
California.

Karhunen, Kari Ol, Ph.D. (University of Helsinki), Docent, Institute of Mathematics, 
University of Helsinki, Helsinki, Finland; Topeliuksenkatu 1.A. 17, Helsinki, Finland. 

Krickeberg, Klaus, Dr.rer.nat. (Humboldt-Universitat Berlin), Professor, Institut fur
Angewandte Mathematik der Universitat Heidelberg, Tiergartenstr., Heidelberg, Germany.







Lewis, Tobias, M.A. (Oxon), Lecturer in Mathematical Statistics, University of Manchester, 
Manchester 13, England. 

Lichtefeld, Merle H. (Mrs.), Master of Public Health (University of North Carolina), 
Director—Division of Statistical Services, Kentucky State Department of Health, 
620 S. Third Street, Louisville, Kentucky; 3112 Tremont Drive, Louisville 5, Kentucky.

Lipow, Myron, B.S. (California Institute of Technology), Head, Reliability Section,
Propulsion Systems and Development Department, Propulsion Laboratory, Research
and Development Division, Space Technology Laboratories, Inc., 5500 El Segundo Blvd.,
P.O. Box 95001, Los Angeles 45, California.

Mallows, Colin L., Ph.D. (University of London), Lecturer, University College London, 
Gower Street, London, W.1, England. 

Mercer, Alan, M.A. (Cambridge), Senior Scientific Officer, Atomic Weapons Research 
Establishment, Aldermaston, Near Reading, Berks, England. 

Mikulski, Piotr W., Master of Economics (Main School of Plan. and Stat., Warsaw,
Poland), Research Assistant, University of California, Department of Statistics, Berkeley
4, California; 2:22 Haste Street, Apt. 10, Berkeley, California.

Nievergelt, Erwin, Ph.D.II (University of Zurich), Mathematician, Swiss Federal
Railways, Generaldirektion der Schweizerischen Bundesbahnen, Hochschulstr. 6, Bern;
Fliederweg 4, Koniz (Bern), Switzerland.

Nolfi, P., Ph.D., Versicherungskasse der Stadt Zurich, Nuschelerstrasse 31, Zurich 1,
Switzerland.

Norton, Wade A., M.A. (George Peabody College for Teachers), Teaching Fellow, Alabama
Polytechnic Institute, Auburn, Alabama; Mathematics Department, Alabama
Polytechnic Institute, Auburn, Alabama.

Orlando, Frank P., B.S. (Villanova University), Mathematician, Carpenter Steel Co.,
Research Division, Front and Bern Streets, Reading, Pennsylvania.

Perkal, Julian, Professor Doctor (University of Wroclaw), Professor Doctor, University
of Wroclaw and The Mathematical Institute of Polish Academy of Sciences, Instytut
Matematyczny, Wroclaw, Poland, ul. Kopernika Nr 186; Wroclaw 21, Oporow,
Harcerska 24, Poland.

Puri, Madan Lal, M.A. (Punjab University), Research Assistant, University of California,
Department of Statistics, University of California, Berkeley 4, California.

Quesenberry, Charles P., M.S. (Virginia Polytechnic Institute), Student, Department
of Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia.

Reinhardt, Howard E., Ph.D. (University of Michigan), Assistant Professor of Mathe- 
matics, Montana State University, Missoula, Montana. 

Rhyne, Alfred L., A.B. (University of North Carolina), Student, University of North
Carolina, Chapel Hill, North Carolina; 103 Maxwell Road, Chapel Hill, North Carolina.

Robison, Donald Edward, M.S. (Ohio State University), Mathematician, Space
Technology Laboratory, 5500 W. El Segundo, Los Angeles 45, California; 8908 Reading Avenue,
Los Angeles 45, California.

Sakamoto, Takeshi, B.S. (Wakayama University), Graduate Student, Mathematical
Institute, Faculty of Science, Kyushu University, Fukuoka, Japan.

Sando, Frank D., Final Examination (Association Incorporated Statisticians), Statistician,
Reed Paper Group, Malling House, West Malling, Kent; 12 New Road, Ditton, Nr
Maidstone, Kent, England.

Saw, John G., B.Sc. (University of Birmingham), Lecturer (Statistics), Department of 
Statistics, University of North Carolina, Chapel Hill, North Carolina. 

Scruby, Ralph E., Ph.D. (University of Colorado), Senior Research Chemist, E. I.
DuPont de Nemours and Co., Benger Laboratory, Waynesboro, Virginia.

Singer, Sidney, B.A.Sc. (University of Toronto), Graduate Student in Operations
Research, Johns Hopkins University, c/o Industrial Engineering Dept., Baltimore 18,
Maryland.

Singh, Darogak, M.A. (University of Allahabad), Senior Research Statistician, Institute
of Agricultural Research Statistics, Indian Council of Agricultural Research, New Pusa,
Library Avenue, New Delhi, India.

Sturtevant, Charles H., A.B. (San Diego State College), Mathematician, Navy
Electronics Laboratory, San Diego 52, California; 2226 Cardinal Drive, San Diego 11,
California.

Sugimura, Masahiko, B.S. (University of Osaka), Assistant Professor, Department of
Mathematics, Kumamoto Women’s University, Ooemachi, Kumamoto City, Japan.

Switlyk, George, M.S. (Stevens Graduate School), Statistician, Mathematics and Statistics
Group, Engineering Department, E. I. duPont de Nemours and Co., Wilmington 98,
Delaware.

Trawinski, Benon John, B.Sc. (McMaster University), Research Assistant, Dept. of Stat.,
Virginia Polytechnic Institute, Blacksburg, Virginia.

Urbanik, Kazimierz, Ph.D. (Wroclaw University), Assistant Professor, Institute of
Mathematics, Wroclaw University; Institute of Mathematics, Polish Academy of Sciences;
Spoldzielcza 22 m. 3, Wroclaw 12, Poland.

Uzendoski, Alberta, Ph.D. (St. Louis University), Assistant Professor and Chairman,
Department of Mathematics, Lourdes Junior College, 68382 Convent Blvd., Sylvania, Ohio.

Vanamamalai, Seshadri, M.A. (University of Madras), Graduate Assistant, Department
of Mathematics, Oklahoma State University, Stillwater, Oklahoma; Statistical
Laboratory, Oklahoma State University, Stillwater, Oklahoma.

Vere-Jones, David, M.Sc. (University of New Zealand), Research Student, Oxford
University, Magdalen College, Oxford, England.

Wadsworth, George P., Ph.D. (Massachusetts Institute of Technology), Associate
Professor of Mathematics, Department of Mathematics, Room 2-367A, Cambridge 39,
Massachusetts.

Webb, Kenneth W., B.A. (George Washington University), Statistician, Corporation for
Economic and Industrial Research, 734 Fifteenth Street, N. W., Washington 6, D.C.

Weeks, David L., Ph.D. (Oklahoma State University), Assistant Professor of Mathematics,
Oklahoma State University; Statistical Laboratory, Oklahoma State University,
Stillwater, Oklahoma.

Wegmuller, W., Ph.D., Full professor of Mathematical Statistics, University of Berne,
Aegertenstrasse 1, Bern, Switzerland.

Weibull, Christer Felix, Fil.Lic. (University of Lund), Fil.Lic., Institute of Statistics,
University of Lund, Sweden; Bokersgatan 1, Goteborg S, Sweden.

Whittle, Peter, Fil.Dr. (Uppsala University), University Lecturer, Statistical Laboratory,
Cambridge University, Statistical Laboratory, c/o University Chemical Laboratory,
Lensfield Road, Cambridge, England.

Willke, Thomas A., M.S. (Ohio State University), Assistant Instructor, Department of 
Mathematics, Ohio State University, Columbus 10, Ohio. 

Wind, Warren A., B.A., (Luther College), Statistician, Frank N. Magid Associates, 424-425 
Guaranty Building, Cedar Rapids, Iowa. 

Wood, Ernest L., Chartered Life Underwriter (American College of Life Und.), Assistant 
Controller, John Hancock Mutual Life Insurance Company, 200 Berkeley Street, Boston, 
Massachusetts. 

Zackrisson, Uno, Fil.Lic. (University of Gothenburg), Post-Graduate student, Statistical 
Institute, University of Gothenburg, Virrelvindsgatan 16 B, Gothenburg H, Sweden. 

Zubrzycki, Stefan, Ph.D. (Mathematical Institute of the Polish Academy of Science),
Docent, Mathematical Institute of the Polish Academy of Science, Wroclaw, ul.
Katedralna nr 9, Poland.





NEWS AND NOTICES 


SUMMER OFFERINGS IN STATISTICS AT IOWA STATE 
UNIVERSITY 


The Department of Statistics at Iowa State University will offer six applied 
courses in statistical theory and methods in its two 1960 summer sessions. 
These courses are planned primarily for graduate students or research workers 
with limited mathematical backgrounds who wish to use statistical techniques 
intelligently for application to other fields. In addition, courses in special topics 
in theoretical or applied statistics may be studied at the graduate level. Senior 
staff members will be available during most of the summer for consultations on 
research or special problems. 

Students may register for either or both of the six-week summer sessions,
June 6-July 13 and July 13-August 19. The complete list of statistics offerings
for the first session is as follows: Stat. 401, “Statistical Methods for Research
Workers” (at the level of Snedecor’s Statistical Methods); Stat. 447, “Statistical
Theory for Research Workers” (mainly theory of experimental statistics at the
level of Anderson and Bancroft’s Statistical Theory in Research); Stat. 599,
“Special Topics”; and Stat. 699, “Research.” In the second session will be
offered Stat. 402, a continuation of 401; Stat. 448, a continuation of 447; two
courses in applied methods which are more specialized, Stat. 411, “Experimental
Designs for Research Workers,” and Stat. 421, “Survey Designs for
Research Workers”; and finally Stat. 599 and 699. Additional information may
be obtained from T. A. Bancroft, Department Head and Director, Statistical
Laboratory, Iowa State University.


AMS Summer Institute on Finite Groups 


With the support of the National Science Foundation, a Summer Institute
on Finite Groups will be held at the California Institute of Technology in
Pasadena from August 1 to August 28, 1960.

Two distinguished group theorists have been invited from abroad and have
accepted our invitation. These are Professor Graham Higman of Oxford University
and Professor Helmut Wielandt of the University of Tübingen.

Participation in the Summer Institute will be by invitation, but the lectures 
and seminars will be open to interested mathematicians who happen to be in 
the area. 

The Program Committee consists of Professors Richard Brauer, Richard
Bruck, H. S. M. Coxeter, Robert Dilworth, Herbert Ryser, and Marshall Hall,
Jr., Chairman.




AMS Summer Seminar 


This will be the second in the series of Summer Seminars on applied mathematics
and mathematical physics which are being conducted by the American
Mathematical Society with the co-operation of the University of Colorado,
Boulder, Colorado. The first Summer Seminar in the series was held in 1957.
The Seminar will be sponsored by the Atomic Energy Commission, the National
Science Foundation, the Office of Naval Research, and the Office of
Ordnance Research, U. S. Army, and will be held from July 24 to August 19,
1960.

The purpose of the Seminar is primarily instructional, with emphasis on a 
number of carefully prepared basic courses. Lectures on selected topics given by 
outstanding scientists are an added feature. The Seminar is planned to give 
mature mathematicians the opportunity to hear from leading physicists about 
physical theories developed during recent years and to acquaint them with 
relevant mathematical notions and methods. 

Application blanks for admission to the Seminar or for financial assistance
should be directed to: Professor K. O. Friedrichs, Chairman of the Organizing
Committee, New York University, 25 Waverly Place, New York 3, New York.




OPERATIONS RESEARCH AND SYSTEMS ENGINEERING 


The School of Engineering of the Johns Hopkins University is again offering 
an intensive course in Operations Research and Systems Engineering for busi- 
ness, industrial and government personnel. The course will take place in Balti- 
more from June 6 through June 17, 1960 at the Homewood Campus of the 
University. 


The course will include studies and expository lectures on Operations Re- 
search, Systems Engineering, Cost Data, Models, Human Engineering, Com- 
putor Programming, Simulation, Information Theory, Quality Control, Design 
of Experiments, Game Theory, Flow Graphs, System Dynamics, Inventory 
Systems, Waiting Lines, Symbolic Logic, Stability and Linear Programming. 

For further information, write to: Dean, School of Engineering, The Johns 
Hopkins University, Baltimore 18, Maryland. 


NEW TITLE FOR MTAC 


In January 1960 the title of Mathematical Tables and Other Aids to Computa- 
tion was changed to Mathematics of Computation. The new name reflects the 
broadened scope of the journal, which has expanded to meet the need in this 
country for a publication devoted to numerical analysis and computation. 

The change in name to Mathematics of Computation in no way represents a
diminished interest in mathematical tables, which will continue to be given
prime emphasis as in the past. It recognizes an increased interest in other areas
in the field of mathematics of computation, which have grown in importance
and in which rapid advances are now being made. The subscribers will find
future issues of the journal to be a continuation of Mathematical Tables and
Other Aids to Computation with similar style, format and character of contents,
and increased coverage of modern advances in the theory and application of
computational methods.

Mathematics of Computation will continue to be published quarterly and the 
subscription rate will remain at $8.00 per year. Send new and renewal orders to 
Printing & Publishing Office, National Academy of Sciences, National Research 
Council, 2101 Constitution Avenue, Washington 25, D. C. 




TRANSLATIONS OF RUSSIAN JOURNALS 
Theory of Probability and its Applications 


The Society for Industrial and Applied Mathematics announces the appearance 
of the first issue of Volume IV of Theory of Probability and Its Applications. 
This is a complete translation into English of the corresponding issue of the
Russian journal Teoriya Veroyatnostei i ee Primeneniya. During 1960 the
Society will publish separately translations of all four issues of Volume IV 
(1959), will begin the translation of Volume V, and will publish in bound form 
full translations of Volumes I (1956), II (1957) and III (1958). It is expected 
that by early 1961 translations will be appearing within four months of publi- 
cation of the Russian original. 

Theory of Probability and Its Applications is a quarterly journal devoted, as 
its title indicates, to research papers in probability and statistics, and to related 
applications in physics and communication. For the most part, the journal has 
reported the work of Russian authors; its translation offers the first access to 
this material in a Western language. 

The Society for Industrial and Applied Mathematics has been assisted in 
this translation project by a grant from the National Science Foundation. 

Subscriptions to Theory of Probability and Its Applications are being offered
at $18.00 for four current issues (one year) ($9.50 to members of the Society;
add $3.00 for subscriptions outside of the U. S. and Canada). Inquiries may be
addressed to the Society at Box 7541, Philadelphia 1, Pennsylvania.


Soviet Mathematics—Doklady


This new American Mathematical Society journal, published under the 
sponsorship of the National Science Foundation, will contain translations of 
the entire Pure Mathematics section of the Doklady Akademii Nauk SSSR, 
the Reports of the Academy of Sciences of the USSR. 

The Doklady Akademii Nauk SSSR presents short articles, averaging three 
to five pages, covering results of current research in Pure and Applied Mathe- 
matics, Physics, Astronomy, Chemistry, Geology, Botany, Zoology, etc. Only 
the Pure Mathematics section will be translated by the American Mathematical 
Society; other sections are being translated by other agencies. 

All branches of Pure Mathematics are covered in the Doklady in short articles,
similar to those in the Research Announcements section of the Bulletin of the
American Mathematical Society and in Comptes Rendus, Paris, which usually
do not give proofs. Translation of these articles will provide a comprehensive,
up-to-date survey of what is going on in Soviet mathematics, enabling American
mathematicians to keep abreast of current developments in the USSR.

The Pure Mathematics section of the Doklady is expected to total about 1,500 
pages in 1960. Soviet Mathematics—Doklady will be issued six times a year, the 
first issue to appear early in 1960. Rates are as follows: Domestic subscriptions, 
$17.50; foreign subscriptions, $20.00; single issues, $5.00. Orders may be placed 
with the American Mathematical Society, 190 Hope Street, Providence 6, 
Rhode Island. 




VISITING FOREIGN MATHEMATICIANS 


The following list (dated October 12, 1959) of visiting foreign mathematicians
has been received from the Division of Mathematics, National Academy of
Sciences, National Research Council. The information given is, in order, the
name, home country, host institution, and period of visit; AY stands for academic
year, Ind. stands for indefinite. The names of persons whose visit terminates
before March, 1960, have not been included. Abuauad Abujatum, Cesar, Chile,
University of Chicago, Sept. 1959-June 1960; Alvarez de Araya, Jorge, Chile,
University of Washington, Sept. 1958-Aug. 1960; Amitsur, Shimshon A.,
Israel, Yale University, Fall Term 1959-Ind.; Araki, Shôrô, Japan, Institute
for Advanced Study, Sept. 1959-April 1960; Asplund, O. Edgar, Sweden,
Institute for Advanced Study, Sept. 1958-April 1960; Aubert, Karl E., Norway,
Institute for Advanced Study, Sept. 1958-April 1960; Banaschewski,
Bernhard, Germany, Tulane University, AY 1959-60; Baumslag, Gilbert,
England, Princeton University, Sept. 1959-June 1960; Bernays, Paul, Switzerland,
Institute for Advanced Study, Nov. 1959-April 1960; Bergström, H.,
Sweden, Catholic University, Sept. 1960-July 1961; Besicovitch, A. S., England,
University of Pennsylvania, Sept. 1958-June 1961; Bhattacharyya, B.
B., India, North Carolina State College, Raleigh, Oct. 1, 1959-Ind.; Bolt,
Bruce A., Australia, Lamont Geological Observatory, Columbia University,
Jan. 1960-Oct. 1960; Chakravarti, I. M., India, University of North Carolina,
Sept. 1959-Aug. 1960; Christian, Ulrich, Germany, Institute for Advanced
Study, Sept. 1958-April 1960; Clunie, J. G., U. K., Massachusetts Institute
of Technology, Sept. 16, 1959-June 15, 1960; Constantine, Alan, Australia,
Yale University, Fall Term 1959-Ind.; Demazure, Michel, France, Princeton
University, Sept. 1959-June 1960; Deheuvels, René, France, Yale University,
Fall Term 1959-Ind.; Deuring, Max F., Germany, Institute for Advanced
Study, Sept. 1959-April 1960; Dombrowski, Peter L., Germany, Massachusetts
Institute of Technology, Sept. 15, 1959-June 15, 1960; Dugué, Daniel,
France, Catholic University, Mar. 1, 1960-July 1, 1960; Durbin, James, U. K.,
University of North Carolina, July 1959-June 1960; Eicker, Frederick,
Germany, University of North Carolina, March 1959-Ind.; Elston, R. C.,
U. K., University of North Carolina, July 1959-June 1960; Engeler, Erwin,
Switzerland, University of Minnesota, 1958-1960; Foguel, Shaul R., Israel,
University of California, Berkeley, Sept. 1958-June 1960; Fujisaki, G., Japan,
Massachusetts Institute of Technology, Sept. 1, 1959-Aug. 31, 1960; Fullerton,
G. H., N. Ireland, Massachusetts Institute of Technology, Sept. 15, 1959-June
15, 1960; Godement, Roger J., France, University of California, Berkeley,
AY 1959-60; Goes, Gunther, Germany, Northwestern University, Sept. 1,
1959-Sept. 1, 1960; Gribben, Ronald J., England, Massachusetts Institute of
Technology, Sept. 16, 1958-June 15, 1960; Grünbaum, Branko, Israel, Institute
for Advanced Study, Sept. 1958-April 1960; Gurwith, Azriel, Israel,
University of California, Berkeley, Sept. 15, 1959-Aug. 1, 1960; Ha, Kwang
Chul, Korea, University of North Carolina, Sept. 15, 1958-June 30, 1960;
Haefliger, André, Switzerland, Institute for Advanced Study, Sept. 1959-April
1960; Hanani, Haim, Israel, University of Wisconsin, Mathematics Research
Center (Army), Sept. 1959-Aug. 1960; Hannan, Edward J., Australia,
University of North Carolina, Oct. 1959-May 1960; Healy, M. J. R., England,
Bell Telephone Laboratories, Oct. 1959-Ind.; Helgason, Sigurdur, Iceland,
Columbia University, Sept. 1959-June 1960; Hirzebruch, Friedrich, Germany,
Institute for Advanced Study, Sept. 1959-April 1960; Igusa, Jun-ichi,
Japan, Institute for Advanced Study, Sept. 1959-April 1960; Ionescu, Cassius
I., Rumania, Yale University, 1957-Ind.; Jenkins, Gwilym M., U. K., Stanford
University, Sept. 1959-Sept. 1960; Kampé de Fériet, Joseph, France, David
Taylor Model Basin (Navy), and Harvard University, March-April 1960;
Katznelson, Y., Israel, University of California, Berkeley, AY 1959-60;
Kinoshita, Shin'ichi, Japan, Institute for Advanced Study, Sept. 1959-April
1960; Kocak, Cevdet, Turkey, University of Maryland, Sept. 1959-July 1960;
Kosinski, A. A., Poland, University of California, Berkeley, AY 1959-60;
Kurepa, Djuro, Yugoslavia, Institute for Advanced Study, Fall Semester 1959,
University of Colorado, Spring Semester 1960, Sept. 1959-June 1960; Kurepa,
Svetozar, Yugoslavia, Tulane University, Second Semester 1959-60, and
University of Maryland, Feb. 1960 to June 1961, 1959-June 1961; Laha, Radha
G., India, Catholic University, Sept. 1957-Sept. 1960; Lambek, Joachim,
Canada, Institute for Advanced Study, Sept. 1959-April 1960; Lanczos, C.,
Ireland, Mathematics Research Center (Army), University of Wisconsin, Oct.
1959-March 1960; Lee, Shing-Meng, Formosa, Northwestern University,
Sept. 1, 1959-Sept. 1, 1960; Levy, Azriel, Israel, Massachusetts Institute of
Technology, Sept. 16, 1958-June 15, 1959, University of California, Berkeley,
AY 1959-60, Sept. 1958-June 1960; Los, Jerzy, Poland, University of California,
Berkeley, Oct. 1, 1959-Aug. 1, 1960; Masani, Pesi R., India, Brown
University, Sept. 1959-June 1961; McMahon, James J., Ireland, Fordham
University, Sept. 1959-June 1960; Michael, David H., U. K., Harvard University,
Sept. 1959-July 1960; Mitsui, T., Japan, Massachusetts Institute of
Technology, Sept. 1, 1959-Aug. 31, 1960; Mordell, L. J., U. K., University of
Colorado, Sept. 1959-June 1960; Morikawa, Hisasi, Japan, Institute for
Advanced Study, Sept. 1958-April 1960; Mrowka, S., Poland, University of
Michigan, Sept. 1959-June 1960; Murre, Jacob P., Netherlands, Northwestern
University, Sept. 1, 1959-Sept. 1, 1960; Nagata, Jun-iti, Japan, University
of Washington, Sept. 16, 1959-June 15, 1960; Naim, Linda, Israel, University
of California, Berkeley, Spring Term 1960; Nakaoka, Minoru, Japan, Institute
for Advanced Study, Sept. 1958-April 1960; Néron, André, France,
Institute for Advanced Study, Sept. 1959-April 1960; Nieto, José, Colombia,
University of Maryland, June 1960-June 1961; Obata, Morio, Japan, University
of Illinois, Sept. 1958-Aug. 1960; Onesto, Nello, Italy, Massachusetts
Institute of Technology, Oct. 1959-July 1960; Ono, Takashi, Japan, Institute
for Advanced Study, Sept. 1959-April 1960; Ostrowski, Alexander M.,
Switzerland, American University and National Bureau of Standards, Sept. 2,
1958-Feb. 1959, Mathematics Research Center (Army), University of Wisconsin,
Oct. 1959-June 1960, Sept. 1958-June 1960; Papakyriakopoulos, Christos,
Greece, Princeton University, Sept. 1959-June 1960; Park, Samuel, Korea,
Columbia University, Sept. 1959-June 1960; Pedersen, Flemming P., Denmark,
University of Southern California, July 1958-June 1960; Poncet, Jean,
Switzerland, Institute for Advanced Study, Sept. 1959-April 1960; Ramanujan,
M. S., India, University of Michigan, Sept. 1956-June 1961; Rasiowa, H.,
Poland, University of Chicago, Oct. 1, 1959-Mar. 31, 1960; Ree, Rimhak,
Korea, Columbia University, Sept. 1959-June 1960; Reis, F. B., Brazil,
University of Maryland, Sept. 1, 1959-June 30, 1960; Remmert, Reinhold, Germany,
Institute for Advanced Study, Sept. 1959-April 1960; Ribenboim,
Paulo, Brazil, University of Illinois, Sept. 1959-June 1960; Rim, Dock Sang,
Korea, Columbia University, Sept. 1959-June 1960; Robertson, Alex, Scotland,
University of Kansas, Sept. 1, 1959-May 31, 1960, In U. S. summer 1960;
Robertson, Wendy, Scotland, University of Kansas, Sept. 1, 1959-May 31,
1960, In U. S. summer 1960; Roy, Jogabrata, India, University of North
Carolina, Oct. 1959-Sept. 1961; Sakurai, A., Japan, Institute of Mathematical
Sciences, New York University, Sept. 1958-Aug. 1959, Massachusetts Institute
of Technology, Sept. 1959-Sept. 1960, Sept. 1958-Sept. 1960; Samuel, Pierre,
France, University of Illinois, Sept. 1959-June 1960; Satake, Ichiro, Japan,
Institute for Advanced Study, Sept. 1958-April 1960; Saw, John G., U. K.,
University of North Carolina, Aug. 1959-May 1960; Schaetz, R. L., Germany,
University of Michigan, Sept. 1959-June 1960; Schütte, Kurt, Germany,
Institute for Advanced Study, Sept. 1959-April 1960; Shah, S. M., India,
University of Wisconsin, Fall Semester I, 1958-59, Northwestern University,
Sept. 1, 1959-Sept. 1, 1960, Sept. 1958-Sept. 1960; Shaikh, Mohammad Yusuf,
Pakistan, University of Tennessee, Sept. 1, 1959-Aug. 31, 1960; Shimrat, M.,
Israel, University of Michigan, Sept. 1959-June 1960; Singh, S. K., India,
University of Kansas, Sept. 1, 1958-May 31, 1960; Skovgaard, H., Denmark,
California Institute of Technology, Jan. 1, 1959-June 30, 1960; Staal, R. A.,
Canada, University of California, Berkeley, Academic Year 1959-60; Sunouchi,
Genichiro, Japan, Northwestern University, Sept. 1, 1959-Aug. 31, 1960;
Szmielew, Wanda, Poland, University of California, Berkeley, Spring Term
1960; Takeuti, Gaisi, Japan, Institute for Advanced Study, Sept. 1959-April
1960; Temple, George F. J., U. K., Institute for Advanced Study, Sept.-Dec.
1959, Applied Mathematics Laboratory, David Taylor Model Basin (Navy),
Jan. 1960-Mar. 31, 1960, Sept. 1959-March 1960; Thoma, Elmar, Germany,
University of Washington, Sept. 16, 1959-June 15, 1960; Toda, Hirosi, Japan,
Institute for Advanced Study, Sept. 1959-April 1960; Togo, S., Japan,
Northwestern University, June 1, 1958-Mar. 31, 1960; Van, Ping-Chang, Taiwan,
Texas Technological College, Sept. 1959-June 1960; Van de Ven, Ton, Netherlands,
Institute for Advanced Study, Sept. 1959-April 1960; Varadarajan, D.
S., India, Princeton University, Nov. 1, 1959-Aug. 31, 1960; Walker, A. G.,
U. K., University of Washington, Sept. 16, 1959-June 15, 1960; Wallace,
David, U. K., Princeton University and Harvard University, Sept. 1958-July
1960; Waters, Kenneth, U. K., Brown University, Sept. 1959-July 1960;
Weston, Jeffrey D., U. K., California Institute of Technology, Jan. 1, 1960-
June 30, 1960; Whitehead, J. H. C., U. K., University of Chicago, Fall Term,
Oct. 1959-Dec. 1959, Institute for Advanced Study, Spring Term, Jan.-April
1960, Oct. 1959-April 1960; Young, Eutiquio C., Philippines, University of
Maryland, Sept. 1958-Aug. 1960.




REPORT OF THE WASHINGTON, D. C. MEETING OF THE 
INSTITUTE OF MATHEMATICAL STATISTICS 


The eighty-second meeting of the Institute of Mathematical Statistics, the 
twenty-second annual meeting, was held in Washington, D. C., on December 
27-30, 1959 in conjunction with meetings of the American Statistical Associa- 
tion and the Biometric Society (ENAR). 

There were 492 members of the Institute registered for the meeting. The 
program of the meeting was as follows: 


SUNDAY, DECEMBER 27, 1959 
10:00 a.m.—Contributed Papers I 


Chairman: S. S. Gupta, Bell Telephone Laboratories.
1. “A ‘Renewal’ Limit Theorem for General Stochastic Processes,” V. E. Beneš, Bell
Telephone Laboratories and Dartmouth College.
2. “Examples of Two Independent Separable Processes whose Sum is Not Separable,” T.
Ferguson, Princeton University.
3. “First Emptiness of Two Dams in Parallel,” J. M. Gani, Columbia University.
4. “Optimum Decision Procedures for a Poisson-Process Parameter,” J. A. Lechner,
University of Maryland.
5. “Inference in Stochastic Processes I: Testing Composite Hypotheses” (Preliminary
Report), M. M. Rao, Carnegie Institute of Technology.
6. “Testing of Hypotheses on Categorical Data,” S. N. Roy, University of North Carolina
and V. P. Bhapkar, University of North Carolina and University of Poona, India.
7. “Some Nonparametric Problems: I,” V. P. Bhapkar, University of North Carolina
and University of Poona, India. (By title)
8. “A Rank Sum Test for Comparing all Pairs of Treatments,” R. G. D. Steel, Cornell
University.
9. “Mathematical Models for Ranking from Paired Comparisons,” H. D. Brunk,
University of Missouri. (By title)
10. “Some Nonparametric Problems: II,” V. P. Bhapkar, University of North Carolina
and University of Poona, India. (By title)
11. “Stochastic Approximation and ‘Minimax’ Problems,” L. A. Gardner, Jr., MIT
Lincoln Laboratory. (By title)


10:00 a.m.—Contributed Papers II 


Chairman: Jack Nadler, Bell Telephone Laboratories.
1. “On the Problem of Negative Estimates of Variance Components,” W. A. Thompson, Jr.,
University of Delaware.
2. “On the Exactness of the Missing Plot Procedure in a Randomized Block Design,” J. L.
Folks, Texas Instruments Incorporated.
3. “Some Randomization Consequences in Balanced Incomplete Blocks,” G. Zyskind,
University of North Carolina and Iowa State University.
4. “Main-Effect Designs for Asymmetrical Factorial Experiments,” S. Addelman, Iowa
State University.
5. “On the Efficiency of Experimental Designs,” S. N. Roy, S. S. Shrikhande, and P. R.
Krishnaiah, University of North Carolina.
6. “Reduction of Multiple Regression Systems by Use of Direct Products of Matrices,”
J. Lieblein, David Taylor Model Basin.
7. “Small Sample Behavior of Estimators of Parameters in a Linear Functional
Relationship,” M. Dorff and J. Gurland, Iowa State University.
8. “Remarks on ‘Standard Coefficients’ in Normal Regression Analysis,” P. R. Krishnaiah
and M. M. Rao, University of Minnesota. (By title)


1:00 p.m.—Multivariate Analysis I (ASA, BS, and IMS)


Chairman: S. S. Wilks, Princeton University.
1. “Multivariate Chebyshev Type Inequalities,” A. W. Marshall, Stanford University.
2. “Some Applications of Multivariate Chebyshev Inequalities,” I. Olkin, Michigan State
University.
3. “A Representation of the Wishart Matrix,” R. A. Wijsman, University of Illinois.


2:00 p.m.—Wald Lecture I


Chairman: K. L. Chung, Syracuse University.
“Balanced and Group Divisible Designs,” R. C. Bose, University of North Carolina.


3:00 p.m.—Special Invited Address


Chairman: H. Chernoff, Stanford University.
“Recent Work in the Theory of Dams,” W. L. Smith, University of North Carolina.


4:00 p.m.—Invited Papers I 


Chairman: P. P. Billingsley, University of Chicago.
1. “On a Problem of Queues with Single Server,” L. Takacs, Columbia University.
2. “Recent Developments in Inventory Theory,” H. Scarf, Stanford University.
3. “Semi-Markov Processes—Denumerable State Space,” R. Pyke, Columbia University.


4:00 p.m.—Contributed Papers III 


Chairman: J. F. Pauls, Smith, Kline and French Laboratories.
1. “On Time Series Analysis and Reproducing Kernel Space,” N. D. Ylvisaker, Columbia
University. (By title)
2. “Asymptotic Expansions for the Mean and Variance of the Serial Correlation Coefficient,”
J. S. White, Minneapolis Honeywell Regulator Company.
3. “On the Distribution of a Noncircular Serial Correlation Coefficient with Lag 1 When
the Mean of the Observations is Unknown,” F. Eicker, University of North Carolina.
4. “On the Distribution of the Sum of Circular Serial Correlation Coefficients and the Effect
of Non-Normality on its Distribution,” V. K. Murthy, University of North Carolina.
5. “On the Monotonic Character of the Power Functions of Two Multivariate Tests,” S. N.
Roy and W. F. Mikhail, University of North Carolina.
6. “On Tests of Certain Types of Hypotheses Involving the Dispersion Matrices of Two or
More Multivariate Normal Distributions and the Associated Confidence Bounds,”
S. N. Roy, University of North Carolina and R. Gnanadesikan, Bell Telephone
Laboratories.
7. “On the Determinants and Characteristic Equations of a Class of Patterned Matrices,”
S. N. Roy, B. G. Greenberg and A. E. Sarhan, University of North Carolina.
8. “Further Results on Hypothesis of No Interaction in Multidimensional Tables,” P. R.
Krishnaiah and V. K. Murthy, University of North Carolina. (By title)
9. “A Solution of the Classification Problem,” S. N. Roy, University of North Carolina.
(By title)


8:00 p.m.—1959 Council Meeting 


President: J. Wolfowitz, Cornell University.


MONDAY, DECEMBER 28, 1959 
9:00 a.m.—Wald Lecture II


Chairman: W. G. Cochran, Harvard University.
“Orthogonal Squares and Orthogonal Arrays,” R. C. Bose, University of North Carolina.


10:30 a.m.—Invited Papers II


Chairman: D. B. Duncan, University of North Carolina.
1. “Multiple Decision Rules Which Control All Kinds of Errors,” W. J. Hall, University
of North Carolina.
2. “Remarks about a Paired Comparison Model,” G. Noether, Boston University.
3. “Rank Tests of Locally Most Powerful Type,” H. Uzawa, Stanford University.


10:30 a.m.—Invited Papers III 
Chairman: E. Lukacs, Catholic University.
1. “The Spectral Theory of Operators in a Hilbert Space and its Application to the
Derivation of Some Statistical Theorems,” E. J. Hannan, Canberra University College,
Australia.
2. “On Explicit Solutions to Problems of Linear Prediction and Regression Analysis of
Continuous Parameter Time Series,” E. Parzen, Stanford University.


10:30 a.m.—Multivariate Analysis II (ASA and IMS)


Chairman: R. L. Anderson, North Carolina State College.
1. “Sequential Chi-Square and T² Tests,” J. E. Jackson, Eastman Kodak Company and
R. A. Bradley, Florida State University.
2. “Multivariate Correlation Models with Mixed Discrete and Continuous Variables,”
R. F. Tate, University of Washington.
3. “Analysis of Multi-Factor Multi-Response Experiments with Mixed Factor and
Response Types,” S. N. Roy, University of North Carolina.


2:00 p.m.—Contributed Papers IV


Chairman: J. Durbin, University of North Carolina.
1. “On the Foundations of the Theory of Testing Hypotheses” (Preliminary Report), A.
Birnbaum, New York University.
2. “Multiple-Decision Ranking Problems Arising from Factorial Experiments on Variances
of Normal Populations” (Preliminary Report), R. E. Bechhofer, Cornell University.
3. “A Modified Procedure for Group Testing,” M. Sobel, Bell Telephone Laboratories
and New York University.
4. “A Problem in Restrictive Group Testing,” M. Sobel, Bell Telephone Laboratories
and New York University and P. A. Groll, Bell Telephone Laboratories.
5. “A Single Sample Decision Procedure for Selecting a Subset Containing the Best of
Several Normal Populations and Some Extensions,” S. S. Gupta, Bell Telephone
Laboratories.
6. “Use of Prior Knowledge in Finding the Maximum Response,” R. J. Buehler, Iowa
State University.
7. “Partnership Games with Secret Conventions Prohibited,” M. Fox and H. Rubin,
Michigan State University.
8. “On a Single Sample Procedure for Selecting from Several Normal Populations a Subset
Containing the Population with the Smallest Variance,” S. S. Gupta, Bell Telephone
Laboratories and M. Sobel, Bell Telephone Laboratories and New York University.
(By title)
9. “On the Distribution of the Ratio of the Smallest of Several Chi-Squares to an Independent
Chi-Square,” S. S. Gupta, Bell Telephone Laboratories and M. Sobel, Bell Telephone
Laboratories and New York University. (By title)


2:00 p.m.—Recent Developments in Information Theory (in cooperation with 
the Professional Group on Information Theory, Institute of Radio Engineers). 
Chairman: D. Slepian, Bell Telephone Laboratories.


1. “Information Theory and Statistics,” S. Kullback, George Washington University.
2. “Error Correcting Codes,” D. K. Ray-Chaudhuri, Case Institute of Technology.


2:00 p.m.—Classification and Discrimination I, (ASA, BS, and IMS) 


Chairman: H. Solomon, Stanford University.
1. “Classification into Multivariate Normal Distributions with Unequal Covariance
Matrices,” T. W. Anderson, Columbia University and R. R. Bahadur, Indian Statistical
Institute.
2. “A Representation for Anderson's W Statistic,” A. Bowker, Stanford University.
3. “An Asymptotic Expansion for the Distribution Function of Anderson's Classification
Statistics W,” R. Sitgreaves, Columbia University.


4:00 p.m.—Contributed Papers V 


Chairman: R. Bechhofer, Cornell University.
1. “Asymptotically Optimal Stopping Rules in Sequential Analysis” (Preliminary
Report), H. Chernoff, Stanford University.
2. “Unbiased Sequential Estimation for Certain Two Parameter Problems” (Preliminary
Report), B. Brainerd, University of Western Ontario and I. Chorneyko and T.
V. Narayana, University of Alberta.
3. “Minimax Sequential Tests of Some Composite Hypotheses,” M. H. DeGroot, Carnegie
Institute of Technology.
4. “Optimum Properties and Admissibility of Sequential Tests,” D. L. Burkholder and
R. A. Wijsman, University of Illinois.
5. “Distribution of Sample Size in Sequential Sampling,” L. L. Lasman, Florida State
University and E. J. Williams, North Carolina State College.
6. “Power Characteristics of the Control Chart for Number of Defects, No Standard Given,”
E. B. McCue, Ohio University.
7. “Confidence Bounds for an Integral Function of an Estimate with Applications to
Reliability Theory,” S. C. Saunders, Boeing Scientific Research Laboratories.
8. “The Method of Moments Applied to a Mixture of Two Exponential Distributions,” P.
R. Rider, Wright-Patterson Air Force Base.


4:00 p.m.—The Organization of Statistical Instruction in Colleges and Uni- 
versities, (ASA and IMS) 


Chairman: E. Parzen, Stanford University.
Panel: T. A. Bancroft, Iowa State University.
J. Neyman, University of California.
G. E. Nicholson, Jr., University of North Carolina.
S. S. Wilks, Princeton University.
Discussion: F. Mosteller, Harvard University.
H. Solomon, Stanford University.


4:00 p.m.—Classification and Discrimination II (ASA, BS, and IMS) 


Chairman: S. W. Greenhouse, National Institutes of Health.
1. “Some Problems in Multivariate Classification Using Discrete Variates,” W. G. Cochran,
Harvard University and C. E. Hopkins, University of Oregon.
2. “A Descriptive Application of Discriminant Functions in Physical Anthropology,”
M. J. R. Healy, Bell Telephone Laboratories and Rothamsted Experimental Station.
3. “Linear Multivariate Models as Representative of Clinical Judgment,” P. J. Hoffman,
University of Oregon.


4:00 p.m.—Contributed Papers VIII 


Chairman: H. Nisselson, Bureau of the Census.
1. “A Simplified Method for Finding Confidence Limits on the Relative Risk in 2 × 2 Tables,”
J. J. Gart, The Johns Hopkins University. (By title)
2. “Stationary Probabilities for a Semi-Markov Process with Finitely Many States,” R.
Pyke, Columbia University.
3. “On Estimating the Mean of a Finite Population,” J. Roy and I. M. Chakravarti,
Indian Statistical Institute and University of North Carolina. Introduced by R.
C. Bose.
4. “On the Probability of Detection of Noise-Like Signals,” W. M. Stone, Boeing Airplane
Co. and Oregon State College, and K. J. Hammerle, Boeing Airplane Co.
Introduced by J. B. Tysver.
5. “Identifiability of Mixtures,” H. Teicher, Purdue University.
6. “Generalization of a Theorem of Polya, and Applications,” R. R. Rao, Indian Statistical
Institute. Introduced by R. R. Bahadur. (By title)
7. “When to Stop,” H. Robbins, Columbia University. (By title)


6:00 p.m.—Business Meeting 


President: J. Wolfowitz, Cornell University.


8:00 p.m.—1960 Council Meeting 


President: J. W. Tukey, Princeton University and Bell Telephone Laboratories.


TUESDAY, DECEMBER 29, 1959 
9:00 a.m.—Wald Lecture III 


Chairman: W. Hoeffding, University of North Carolina.
“Factorial Designs—Confounding and Fractional Replication,” R. C. Bose, University
of North Carolina.


10:30 a.m.—Invited Papers IV 


Chairman: M. Sobel, Bell Telephone Laboratories and New York University.
1. “Construction of Optimum Experimental Designs,” J. Kiefer, Cornell University.
2. “Generalized Bayes Solutions in Estimation Problems,” J. Sacks, Columbia University.
3. “On a Unified Theory of Estimation,” A. Birnbaum, New York University.


10:30 a.m.—Mathematical Models in the Life Sciences, (ASA, BS, and IMS) 


Chairman: H. F. Bright, George Washington University.
1. “The Two-Stage Mutation Model for Carcinogenesis and Experimental Means of its
Verification,” J. Neyman and E. L. Scott, University of California.
2. “On the Relation Between the Structural Geometry and the Function of Individual
Neurons,” W. Rall, National Institutes of Health.
3. “Competing Risks and the Follow-up Study,” C. L. Chiang, University of California.


2:00 p.m.—Rietz Lecture


Chairman: J. Wolfowitz, Cornell University.
“The Problem of Two Samples from Continuous Distributions,” S. S. Wilks, Princeton
University.


3:00 p.m.—Special Invited Address


Chairman: C. Derman, Columbia University.
“Sufficiency and Approximate Sufficiency,” L. LeCam, University of California.


4:00 p.m.—Invited Papers V 


Chairman: M. Rosenblatt, Brown University.
1. “Group Representation Methods in Multivariate Distribution Theory,” A. T. James,
Yale University.
2. “Recent Developments in the Convergence of Stochastic Processes,” G. Kallianpur,
Michigan State University.
3. “Toeplitz Forms and Generalizations of Bernoulli Trials,” F. Spitzer, University of
Minnesota.


4:00 p.m.—Contributed Papers VI 


Chairman: M. Gurney, Bureau of the Census.
1. “Cross-Compounded Distributions,” R. A. Epstein and L. R. Welch, California
Institute of Technology.
2. “Generalizations of Thompson's Distribution, II,” A. G. Laurent, Wayne State
University.
3. “On the Decomposition of Certain Characteristic Functions,” B. Ramachandran,
Catholic University. Introduced by E. Lukacs.
4. “Certain Extensions of a Theorem of Marcinkiewicz” (Preliminary Report), I. Christensen,
Catholic University.
5. “On the Characterization of a Family of Populations which Includes the Poisson
Population,” E. Lukacs, Catholic University.
6. “Generalized Power Series Distribution and Certain Characterization Theorems,” G. P.
Patil, University of Michigan.
7. “On Characterization Problems Connected with Quadratic Regression,” R. G. Laha and
E. Lukacs, Catholic University.
8. “Certain Uncorrelated Statistics,” R. V. Hogg, University of Iowa.


10:00 p.m.—Informal Party (ASA, BS, and IMS) 


WEDNESDAY, DECEMBER 30, 1959 
9:00 a.m.—Wald Lecture IV 


Chairman: E. Parzen, Stanford University. 
“Construction of Error Correcting Codes,’’ R. C. Bose, University of North Carolina. 


10:30 a.m.—Invited Papers VI 


Chairman: R. Gnanadesikan, Bell Telephone Laboratories.
1. “Moment Generating Functions of Quadratic Forms of Normal Order Statistics,” H.
Ruben, Columbia University.
2. “Some Nonparametric Tests in Multivariate Problems,” L. Weiss, Cornell University.
3. “The Actual Distribution of One Sample Kolmogorov-Smirnov Statistics,” M. Dwass,
Northwestern University.


10:30 a.m.—Final Report of the Advisory Committee on Weather Control 


Chairman: A. H. Bowker, Stanford University.
1. “Criticism of the Final Report of the Advisory Committee on Weather Control,” J.
Neyman, University of California.
2. “Bias in the Evaluation of Cloud Seeding Operations Introduced by the Transformation
of Variables,” E. L. Scott, University of California.
3. “Problems of Experimental Design,” J. Youden, National Bureau of Standards.
4. “Stochastic Theory of Precipitation,” L. LeCam, University of California.


10:30 a.m.—Contributed Papers VII 


Chairman: D. Rubinstein, General Electric Co.
1. “The Estimation of the Location of a Discontinuity in Density,” H. Rubin, Michigan
State University.
2. “A Subfield Containing a Sufficient Subfield is Not Necessarily Sufficient,” D. L.
Burkholder, University of Illinois.
3. “Polya Type Distributions of Convolutions,” S. Karlin, Stanford University and F.
Proschan, Sylvania Electronic Products, Inc.
4. “On Infinite Packing Theorem for Spheres: A New Application of the Borel-Cantelli
Lemma,” O. Wesler, University of Michigan.
5. “Conditional Expectations of Banach-valued Random Variables,” S. D. Chatterji,
Michigan State University. Introduced by K. J. Arnold.
6. “Some Asymptotic Results for a Coverage Problem,” M. Halperin, Knolls Atomic
Power Laboratory. (By title)
7. “A Law of Large Numbers for Dependent Random Variables,” E. Parzen, Stanford
University. (By title)
8. “A New Inversion Formula,” E. Parzen, Stanford University. (By title)
9. “A New Proof of the Continuity Theorem of Probability Theory,” E. Parzen, Stanford
University. (By title)




REPORT OF THE SECRETARY FOR 1959 


During the year the Institute has held its 80th through 82nd meetings in
Pittsburgh, Pennsylvania, Cleveland, Ohio, and Washington, D. C., respectively.
A Business Meeting was called during the 82nd meeting (22nd Annual)
and subsequently made the necessary decisions for carrying on the work of the
Institute for the coming year. The Institute is gratified with the presentation
of the programs at all of its meetings. M. B. Wilk, Program Coordinator, with
the help of the local Program Chairmen, Franklin Graybill, Boyd Harshbarger,
and Emanuel Parzen, and their committees, is to be congratulated. The
Assistant Secretaries, Donovan J. Thompson, Fred C. Leone, and Joan Rosenblatt,
should be mentioned also for making very satisfactory arrangements for these
meetings. Dorothy Gilford and Jack Silber as Associate Secretaries competently
carried out the duties of the Secretary with respect to meetings.




REPORT OF THE EDITOR FOR 1959 


During the operating year August 1, 1958, to July 31, 1959, the number of 
manuscripts submitted to the Annals remained approximately the same as in 
the two preceding years, about 180. The size of the printed volume for the 
calendar year 1959, about 1300 printed pages, has been adequate to keep the 
backlog of accepted but unprinted pages to a negligible size, virtually zero. 
Increased costs continue to form a serious problem that will be met by a series 
of actions, including the institution of page charges and of paid advertising. 

I thank the Associate Editors for their skillful and difficult work, which 
forms an essential part of our editorial procedure. Mrs. Cynthia Zilliae has 
transcended the literal duties of her position as Editorial Assistant in setting 
up and carrying on the activities of my editorial office; I am deeply grateful 
to her. 

My thanks also go to the University of Chicago for its considerable material 
aid. Finally, it is a pleasure to list the names of the referees of papers for which 
final editorial decisions have been made during the period December 1958 to 
January 1960, inclusive. The work of referees forms the core of Annals editorial 
activity, and I regret that this public expression of thanks can perforce only 
take the form of a listing. 







S. Addelman
T. W. Anderson
J. Anscombe
R. R. Bahadur
Bancroft
W. Barankin
G. A. Barnard
G. Baxter
M. L. Beale
. E. Beard
R. E. Bechhofer
. Benkhard
P. Billingsley
Birnbaum
Blackwell
R. Blum
M. Blumenthal
R. Blyth
R. C. Bose
G. E. P. Box
Bradt
R. J. Buehler
D. L. Burkholder
. C. Carver
. G. Chapman
. S. Chern
K. L. Chung
Clarke
Cohen, Jr.
. S. Connor
Cornfield
Cote
. R. Cox
. Curnow
. E. Daniels
. A. Darling
. A. David
. T. David
C. Derman
. S. Dhruvarajan
L. Doob
. Dubins
M. Dwass
S. Dwyer
. Elfving
Feldman
Ferguson
J. L. Folks
G. Foster
M. Fox
A. S. Fraser
. Gautschi


M. O. Glasgow 
R. Gnanadesikan 
H. J. Godwin 
K. Goldberg 

I. J. Good

L. A. Goodman 
B. Greenberg 
U. Grenander 
S. S. Gupta

J. Gurland 

M. Halperin 

J. M. Hammersley 
J. Hannan 

J. L. Hodges, Jr. 
Wassily Hoeffding 
A. James 

G. M. Jenkins 
M. Johns 

N. L. Johnson 
G. Kallianpur 
S. Karlin 

L. Katz 

O. Kempthorne 
M. G. Kendall 
J. Kiefer 

C. Kraft 

P. R. Krishnaiah 
M. D. Kruskal 
S. Kullback

L. LeCam 

D. V. Lindley 
E. Lukacs 

C. L. Mallows 
J. Mandel 

H. B. Mann 

A. W. Marshall 
J. G. Mauldon 
J. C. P. Miller
R. G. Miller 

L. Moses 

H. McKean 
Morris Newman 
C. J. Nesbitt 

P. Nidditch 

G. Noether 

I. Olkin 

D. B. Owen 

E. Parzen 

G. W. Pegiar 
W. Perks 

R. L. Plackett 
N. W. Please 


John Pratt 
Ronald Pyke 
M. F. Quenouille 
R. Radner 

I. Rabinowitz 
H. Raiffa 

M. M. Rao 

S. K. Ray
Edgar Reich 

J. Riordan 
Herbert Robbins 
D. Robson 

M. Rosenblatt 
S. N. Roy
Harold Ruben 
I. R. Savage 

L. J. Savage 

E. L. Scott

B. V. Shah 

S. S. Shrikhande
M. M. Siddiqui 
R. Sitgreaves 

J. G. Skellam 
D. Slepian 

L. J. Snell 

C. M. Stein 

P. Suppes 

J. C. Tanner

R. F. Tate 
Mary N. Torrey 
D. R. Truax 

J. W. Tukey 
Hans Ury 

M. Van Aarde 
H. R. Van der Vaart 
W. Vogel 

D. L. Wallace 
G. S. Watson
Lionel Weiss 

B. L. Welch 
James Wendel 
R. F. White 

R. A. Wijsman 
M. B. Wilk 

J. Wolfowitz 

A. Wortham 

M. Zelen 

G. Zyskind 


February 1, 1960 


William Kruskal, Editor 







PUBLICATIONS RECEIVED 


Hogg, Robert V., and Craig, Allen T., Introduction to Mathematical Statistics, The
Macmillan Co., New York, New York, 1959, 245 pp., $6.75.

Moore, John T., Fundamental Principles of Mathematics, Rinehart and Co., Inc., New York, 
New York, 1960, $7.00. 

U.N. Direction of International Trade, Vol. X, No.8, Columbia University Press, New York, 
New York, 1959, $2.50. 





BOUNDARY PROBLEMS IN DIFFERENTIAL 
EQUATIONS 


edited by Rudolph E. Langer 

BOUNDARY PROBLEMS IN DIFFERENTIAL EQUATIONS is comprised
of the nineteen papers which were delivered at The
Symposium on Boundary Problems in Differential Equations
conducted by the Mathematics Research Center, United States Army,
at the University of Wisconsin, Madison, April 20-22, 1959.
$4.00 


ON NUMERICAL APPROXIMATION 

edited by Rudolph E. Langer 

ON NUMERICAL APPROXIMATION is comprised of the twenty- 
one papers which were delivered at The Symposium on Numerical 
Approximation conducted by the Mathematics Research Center, 
United States Army, at the University of Wisconsin in April, 1958. 
The objective of this symposium was the presentation and discus- 
sion of recent developments in the field of numerical approximation. 
The papers are centered around three general themes: Linear 
Approximation, Extremal Approximation, and Algorithms. $4.50


The University of Wisconsin Press 
430 Sterling Court Madison 6, Wisconsin 





ECONOMETRICA 


Journal of the Econometric Society 
Contents of Vol. 28, No. 1 - January 1960 


Robert Eisner: A Distributed Lag Investment Function

George B. Dantzig: On the Significance of Solving Linear Programming Problems with Some Integer
Variables

T. Kloek and L. B. M. Mennes: Simultaneous-Equations Estimation Based on Principal Components of
Predetermined Variables

H. S. Houthakker: The Capacity Method of Quadratic Programming

A. Ghosh: Input-Output Analysis with Substantially Independent Groups of Industries

A. G. Doig and A. H. Land: An Automatic Method of Solving Discrete Programming Problems

R. L. Basmann: On the Asymptotic Distribution of an Infinite Class of Generalized Linear Estimators

W. V. Candler: A Short-Cut Method for the Complete Solution of Game Theory and Feed-Mix Problems

George H. Borts: Estimation of Rail Cost Functions

Marc Nerlove: The Market Demand for Durable Goods: A Comment


Report of the Bilbao Meetings


Book Reviews


A Critique of the United States Income and Product Accounts (Conference on Research in Income and Wealth).
Review by L. S. Berman. The Life of Knut Wicksell (Torsten Gårdlund). Review by

Linear Programming and Associated Techniques (V. Riley and S. I. Gass). Review by Adalberto Predetti.
Essays in the Theory of Economic Growth (Evsey Domar). Review by W. Krelle. A Short-Term Planning Model
for India (N. V. A. Narasimham). Review by Tadeusz Krause. Economics in the United States of America
(Vining). Review by Hans Peter. Wirtschaftsmechanik (W. G. Waffenschmidt). Review by Frank Ferschl.
Output, Labour and Capital in the Canadian Economy (Hood and Scott). Review by J. B. Heath. Linear
Programming Methods (Heady and Candler). Review by Michel Verhulst.


ANNOUNCEMENTS AND NOTES. 





JOURNAL OF 
THE AMERICAN STATISTICAL ASSOCIATION 
Volume 55 March, 1960 Number 289 


Borts, George H.: Regional Cycles of Manufacturing Employment in the United States, 1919-1953.
Bryant, E. C., Hartley, H. O., and Jessen, R. J.: Design and Estimation in Two-way Stratification. 
Cohen, A. Clifford, Jr.: Estimating the Parameters of a Modified Poisson Distribution. 


Colton, Theodore: A Test Procedure with a Sample from a Normal Population when an Upper Bound to the 
Standard Deviation is Known. 


Fitzpatrick, Paul J.: Leading British Statisticians of the Nineteenth Century. 

Halperin, Max: Extension of the Wilcoxon-Mann-Whitney Test to Samples Censored at the Same Fixed Point.
Khatri, C. G.: On Testing the Equality of Parameters in k Rectangular Populations. 

Rider, Paul R.: Variance of the Median of Small Samples from Several Special Populations. 

Schnore, Leo F.: Three Sources of Data on Commuting: Problems and Possibilities.


Smuts, Robert W.: The Female Labor Force: A Case Study in the Interpretation of Historical Statistics. 
Tukey, John W.: Where Do We Go From Here? 


Wharton, Clifton R., Jr.: Processing Underdeveloped Data from an Underdeveloped Area. 
BOOK REVIEWS. 


AMERICAN STATISTICAL ASSOCIATION 
Beacon Building 1737 K Street, N.W. Washington 6, D.C. 





INTERNATIONAL JOURNAL OF ABSTRACTS 
STATISTICAL THEORY AND METHOD 


The aim of this new Journal is to give complete coverage of papers in the field of statistical theory (including
associated aspects and probability and other mathematical methods) and new contributions to statistical method
as published after 1st October 1958.


All contributions in the following five journals, being wholly devoted to this field, will be abstracted: Annals
of Mathematical Statistics; Biometrika; Journal, Royal Statistical Society (Series B); Bulletin of Mathematical Statistics;
Annals, Institute of Statistical Mathematics; and a further group of six journals will be abstracted on a virtually
complete basis as follows: Biometrics; Metrika; Metron; Review, International Statistical Institute; Technometrics; Sankhya.


There are about 250 other journals partly devoted to statistical theory and method from which the appropriate papers
will be abstracted.


A scheme of classification has been developed for the abstracts that is flexible and facilitates the transfer of code
numbers to punched-cards. A unique aspect of this Journal is that the pages are colour-tinted according to the
main sections of classification. This method of colour-coding the pages provides a distinctive and powerful visual
aid in the identification of abstracts in whatever manner the Journal is filed for reference.


The abstracts will be about 400 words long (the recommendation of UNESCO for the “long” abstract service)
and will be in the English language. This new Journal will be quarterly and contain approximately 1000
abstracts per year.


Annual Subscription £5 (U.S.A. & CANADA $16.00)
Single Number 30s. (U.S.A. & CANADA $4.50)


OLIVER AND BOYD LTD. 
Tweeddale Court, 14 High Street, Edinburgh, 1 








Announcing a new series of books 


Proceedings of Symposia in Pure Mathematics 
Volume I


FINITE GROUPS 


The eleven articles in this book are texts of addresses which were delivered at a symposium
held in April, 1959. The discussions at the symposium were lively and served to indicate an
enormous renewed interest in one of the oldest branches of algebra. The major new results 
in the field which are brought out in this book should serve to stimulate research activity in 
the Theory of Groups, one of the most beautiful subjects of mathematics. 


The authors contributing papers to this book are: 


J. G. Thompson Walter Feit Michio Suzuki
R. C. Lyndon Marshall Hall, Jr. W. E. Deskins
Daniel Gorenstein Daniel Hughes Hans Zassenhaus
H. S. M. Coxeter Wilhelm Magnus


120 pages 25% discount to members $3.90 


American Mathematical Society 
190 Hope Street 
Providence 6, Rhode Island 





TRABAJOS DE ESTADISTICA 


Review published by “Instituto de Investigaciones Estadísticas” of the “Consejo
Superior de Investigaciones Científicas,” Madrid, Spain.


Vol. X CONTENTS Cuaderno II 


F. Azorín Algunos problemas estadísticos en la construcción de escalas de consumo.
D. E. Barton and F. N. David A collector's problem.
J. Talacko On stochastic linear inequalities.
NOTAS.

A. Martin Asin Análisis estadístico de las discordancias en una red de nivelación de Alta Precisión.


J. pe Angespacocnaca Y R. ve Miove. 
Estudio base para el plan de previsión estadística en la producción hidroeléctrica española.


CRONICAS BIBLIOGRAFIA CUESTIONES Y EJERCICIOS.


For everything in connection with works, exchanges and subscription write to 
de Investigaciones Estadisticas, Consejo Su i 

Spain. The Review is com cf three 2 

price is 100 pesetas for South America 













TENTATIVE SCHEDULE 


MEETINGS OF THE INSTITUTE 
CENTRAL REGIONAL MEETING—Lafayette, Indiana, April 7-9, 1960.
EASTERN REGIONAL MEETING—New York City, April 21-23, 1960.
ANNUAL MEETING—Stanford, California, August 23-26, 1960.







